Resizing char array not always working - c++

In a custom string class called Str I have a function c_str() that just returns the private member char* data as const char* c_str() const { return data; }. This works when called after I create a new Str but if I then overwrite the Str using cin, calling c_str() on it only sometimes works, but always works if I cin a bigger Str than the original.
Str b("this is b");
cout << b.c_str() << endl;
cin >> b;
cout << b.c_str() << endl;
Here the first b.c_str() works but if I attempt to change Str b to just 'b' on the cin >> b; line then it outputs 'b' + a bit of garbage. But if I try to change it to 'bb' it usually works, and if I change it to something longer than "this is b", it always works.
This is odd because my istream operator (which is friended) completely deallocates the Str and ends up allocating a new char array only 1 char larger for each char it reads in (just to see if it would work; it doesn't). So it seems like returning the array after reading in something else would return the new array that data is set to.
Relevant functions:
istream& operator>>(istream& is, Str& s) {
delete[] s.data;
s.data = nullptr;
s.length = s.limit = 0;
char c;
while (is.get(c) && isspace(c)) ;
if (is) {
do s.push_back(c);
while (is.get(c) && !isspace(c));
if (is)
is.unget();
}
return is;
}
void Str::push_back(char c) {
if (length == limit) {
++limit;
char* newData = new char[limit];
for (size_type i = 0; i != length; ++i)
newData[i] = data[i];
delete[] data;
data = newData;
}
data[length++] = c;
}
With push_back() like this, the array never has a capacity larger than what it holds, so I don't see how my c_str() could output any memory garbage.

Based on the push_back() in the question and the c_str() in the comment, there is no guarantee that the C-string returned from c_str() is null-terminated. Since a char const* doesn't know the length of the string without the null-terminator this is the source of the problem!
When allocating small memory objects you probably get back one of the small blocks previously used by your string class, which still contains non-null characters. The output then runs past the end of your data until the first null byte happens to be found, so the printed string appears longer than it is. When allocating bigger chunks you seem to get back "fresh" memory which still contains null characters, making the situation appear as if all is OK.
There are basically two ways to fix this problem:
Add a null-terminator before returning a char const* from c_str(). If you don't care about multi-threading for now, this can be done in the c_str() function. In contexts where multi-threading matters it is probably a bad idea to make any mutations in const member functions as these would introduce data races. Thus, the C++ standard string classes add the null-terminator in one of the mutating operations.
Do not support a c_str() function at all but rather implement an output operator for your string class. This way, no null-termination is needed.
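The first option can be sketched as follows, assuming a stripped-down Str that keeps the terminator up to date inside the mutating operations, so c_str() stays a pure read. The doubling growth policy, clear(), and the deleted copy operations are choices made for this sketch, not part of the class in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical minimal Str: only the members needed to show the idea.
class Str {
public:
    using size_type = std::size_t;

    Str() : data(new char[1]), length(0), limit(0) { data[0] = '\0'; }
    ~Str() { delete[] data; }
    Str(const Str&) = delete;             // copying left out to keep the sketch short
    Str& operator=(const Str&) = delete;

    // Invariant: data has room for limit + 1 chars and data[length] == '\0'.
    void push_back(char c) {
        if (length == limit) {
            size_type newLimit = (limit == 0) ? 1 : 2 * limit;  // assumed growth policy
            char* newData = new char[newLimit + 1];             // +1 for the terminator
            std::memcpy(newData, data, length);
            delete[] data;
            data = newData;
            limit = newLimit;
        }
        data[length++] = c;
        data[length] = '\0';   // restore the invariant after every mutation
    }

    // Empties the string but keeps the buffer, still null-terminated.
    void clear() { length = 0; data[0] = '\0'; }

    const char* c_str() const { return data; }   // no mutation needed here
    size_type size() const { return length; }

private:
    char* data;
    size_type length;
    size_type limit;
};
```

With this invariant, operator>> could call clear() instead of deleting the buffer, and c_str() would return a terminated string no matter what the allocator hands back.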

How can I convert const char* to string and then back to char*?

I'm just starting c++ and am having difficulty understanding const char*. I'm trying to convert the input in the method to string, and then change the strings to add hyphens where I want and ultimately take that string and convert it back to char* to return. So far when I try this it gives me a bus error 10.
char* getHyphen(const char* input){
string vowels [12] = {"A","E","I","O","U","Y","a","e","i","o","u","y"};
//convert char* to string
string a;
int i = 0;
while(input != '\0'){
a += input[i];
input++;
i++;
}
//convert a string to char*
return NULL;
}
A: The std::string class has a constructor that takes a char const*, so you simply create an instance to do your conversion.
B: Instances of std::string have a c_str() member function that returns a char const* that you can use to convert back to char const*.
auto my_cstr = "Hello"; // A
std::string s(my_cstr); // A
// ... modify 's' ...
auto back_to_cstr = s.c_str(); // B
First of all, you don't need all of that code to construct a std::string from the input. You can just use:
string a(input);
As far as returning a new char*, you can use:
return strdup(a.c_str()); // strdup is a non-standard function but it
// can be easily implemented if necessary.
Make sure to deallocate the returned value with free(), since strdup allocates with malloc.
It will be better to just return a std::string so the users of your function don't have to worry about memory allocation/deallocation.
std::string getHyphen(const char* input){
Don't use char*. Use std::string, like everyone else here is telling you. This will eliminate all such problems.
However, for the sake of completeness and because you want to understand the background, let's analyse what is going on.
while(input != '\0'){
You probably mean:
while(*input != '\0') {
Your code compares the input pointer itself to '\0', i.e. it checks for a null pointer, because of the unfortunate automatic conversion from a zero-valued char constant. If you tried to compare with, say, 'x' or 'a', then you would get a compilation error instead of runtime crashes.
You want to dereference the pointer via *input to get to the char pointed to.
a += input[i];
input++;
i++;
This will also not work. You increment the input pointer, yet with [i] you advance even further. For example, if input has been incremented three times, then input[3] will be the 7th character of the original array passed into the function, not the 4th one. This eventually results in undefined behaviour when you leave the bounds of the array. Undefined behaviour can also be the "bus error 10" you mention.
Replace with:
a += *input;
input++;
i++;
(Actually, now that i is not used any longer, you can remove it altogether.)
And let me repeat it once again: Do not use char*. Use std::string.
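For reference, here is the corrected loop as a small self-contained sketch. The helper name toString is made up for illustration; in real code the std::string constructor does the same job.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: copies a null-terminated C string into a std::string
// one character at a time, the way the question's loop intended to.
std::string toString(const char* input) {
    assert(input != nullptr);   // a null pointer is not a valid C string
    std::string a;
    while (*input != '\0') {    // test the character, not the pointer
        a += *input;            // append the character currently pointed to
        ++input;                // advance by exactly one character
    }
    return a;
}
```

Only the pointer moves here, so there is no second index that could run ahead of it.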
Change your function declaration from
char* getHyphen(const char* input)
to
auto hyphenated( string const& input )
-> string
and avoid all the problems of conversion to char const* and back.
That said, you can construct a std::string from a char const* as follows:
string( "Blah" )
and you get back a temporary char const* by using the c_str method.
Do note that the result of c_str is only valid as long as the original string instance exists and is not modified. For example, applying c_str to a local string and returning that result yields Undefined Behavior and is not a good idea. If you absolutely must return a char* or char const*, allocate an array with new and copy the string data over with strcpy, like this: return strcpy( new char[s.length()+1], s.c_str() ), where the +1 is to accommodate a terminating zero-byte. The caller is then responsible for releasing the array with delete[].
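A rough sketch of both directions, under a few assumptions: withHyphens and copyToCString are hypothetical names, and the hyphenation rule (a hyphen between every pair of characters) is invented purely so there is something to modify, since the question does not state its rule. The heap copy is wrapped in std::unique_ptr<char[]> so the caller cannot forget delete[].

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical transformation: const char* in, std::string out.
// The rule (hyphen between every pair of characters) is made up for illustration.
std::string withHyphens(const char* input) {
    assert(input != nullptr);
    const std::string s(input);   // char const* -> std::string via the constructor
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (i != 0) out += '-';
        out += s[i];
    }
    return out;                   // returned by value, so nothing dangles
}

// Returns a heap-allocated, null-terminated copy of s that the caller owns.
std::unique_ptr<char[]> copyToCString(const std::string& s) {
    std::unique_ptr<char[]> out(new char[s.size() + 1]);   // +1 for '\0'
    std::memcpy(out.get(), s.c_str(), s.size() + 1);       // copies the terminator too
    return out;
}
```

Returning std::string covers most callers. The copy is only needed when an interface insists on a writable or independently owned char buffer.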

C++ substring from string

I'm pretty new to C++ and I need to create a MyString class with a method that creates a new MyString object from a substring of another one, but the chosen substring changes between the moment the object is created and the moment I print it with my method.
Here is my code:
#include <iostream>
#include <cstring>
using namespace std;
class MyString {
public:
char* str;
MyString(char* str2create){
str = str2create;
}
MyString Substr(int index2start, int length) {
char substr[length];
int i = 0;
while(i < length) {
substr[i] = str[index2start + i];
i++;
}
cout<<substr<<endl; // prints normal string
return MyString(substr);
}
void Print() {
cout<<str<<endl;
}
};
int main() {
char str[] = {"hi, I'm a string"};
MyString myStr = MyString(str);
myStr.Print();
MyString myStr1 = myStr.Substr(10, 7);
cout<<myStr1.str<<endl;
cout<<"here is the substring I've done:"<<endl;
myStr1.Print();
return 0;
}
And here is the output:
hi, I'm a string
string
stri
here is the substring I've done:
♦
Have to walk this through to explain what's going wrong properly so bear with me.
int main() {
char str[] = {"hi, I'm a string"};
Allocated a temporary array of 17 characters (16 characters plus the terminating null), placed the characters "hi, I'm a string" in it, and ended it off with a null. Temporary means what it sounds like. When the function ends, str is gone. Anything pointing at str is now pointing at garbage. It may shamble on for a while and give some semblance of life before it is reused and overwritten, but really it's a zombie and can only be trusted to kill your program and eat its brains.
MyString myStr = MyString(str);
Creates myStr, another temporary variable. Called the constructor with the array of characters. So let's take a look at the constructor:
MyString(char* str2create){
str = str2create;
}
Takes a pointer to a character; in this case it will be a pointer to the first element of main's str. This pointer will be assigned to MyString's str. There is no copying of the "hi, I'm a string". Both main's str and MyString's str point to the same place in memory. This is a dangerous condition because alterations to one will affect the other. If one str goes away, so goes the other. If one str is overwritten, so too is the other.
What the constructor should do is:
MyString(char* str2create){
size_t len = strlen(str2create); //
str = new char[len+1]; // create appropriately sized buffer to hold string
// +1 to hold the null
strcpy(str, str2create); // copy source string to MyString
}
A few caveats: This is really really easy to break. Pass in a str2create that never ends, for example, and the strlen will go spinning off into unassigned memory and the results will be unpredictable.
For now we'll assume no one is being particularly malicious and will only enter good values, but this has been shown to be a really bad assumption in the real world.
This also forces a requirement for a destructor to release the memory used by str
virtual ~MyString(){
delete[] str;
}
It also adds a requirement for copy and move constructors and copy and move assignment operators to avoid violating the Rule of Three/Five.
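As a rough sketch of what the Rule of Three/Five means for this class, here is one way the five special members could look once the constructor copies its input. The private duplicate() helper and the c_str() accessor are additions for this sketch, and the destructor is left non-virtual on the assumption that nothing derives from MyString.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

class MyString {
public:
    explicit MyString(const char* s) : str(duplicate(s)) {}
    ~MyString() { delete[] str; }

    // Copying makes an independent buffer instead of sharing the pointer.
    MyString(const MyString& other) : str(duplicate(other.str)) {}
    MyString& operator=(const MyString& other) {
        if (this != &other) {
            char* copy = duplicate(other.str);  // allocate first: a throw leaves *this intact
            delete[] str;
            str = copy;
        }
        return *this;
    }

    // Moving steals the buffer and leaves the source empty but destructible.
    MyString(MyString&& other) noexcept : str(other.str) { other.str = nullptr; }
    MyString& operator=(MyString&& other) noexcept {
        if (this != &other) {
            delete[] str;
            str = other.str;
            other.str = nullptr;
        }
        return *this;
    }

    const char* c_str() const { return str ? str : ""; }

private:
    // Hypothetical helper: new[]-allocated copy of s (null is treated as "").
    static char* duplicate(const char* s) {
        if (s == nullptr) s = "";
        const std::size_t len = std::strlen(s);
        char* out = new char[len + 1];
        std::memcpy(out, s, len + 1);
        return out;
    }

    char* str;
};
```

With these in place, copying a MyString no longer produces two objects that both try to delete[] the same array.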
Back to OP's Code...
str and myStr point at the same place in memory, but this isn't bad yet. Because this program is a trivial one, it never becomes a problem. myStr and str both expire at the same point and neither are modified again.
myStr.Print();
Will print correctly because nothing has changed in str or myStr.
MyString myStr1 = myStr.Substr(10, 7);
Requires us to look at MyString::Substr to see what happens.
MyString Substr(int index2start, int length) {
char substr[length];
Creates a temporary character array of size length. First off, this is non-standard C++. It won't compile under a lot of compilers, so just don't do this in the first place. Second, it's temporary. When the function ends, this value is garbage. Don't take any pointers to substr because it won't be around long enough to use them. Third, no space was reserved for the terminating null. This string will be a buffer overrun waiting to happen.
int i = 0;
while(i < length) {
substr[i] = str[index2start + i];
i++;
}
OK that's pretty good. Copy from source to destination. What it is missing is the null termination so users of the char array know where it ends.
cout<<substr<<endl; // prints normal string
And that buffer overrun waiting to happen? Just happened. Whups. You got unlucky because it looks like it worked rather than crashing and letting you know that it didn't. Must have been a null in memory at exactly the right place.
return MyString(substr);
And this created a new MyString that points to substr. Right before substr hit the end of the function and died. This new MyString points to garbage almost instantly.
}
What Substr should do:
MyString Substr(int index2start, int length)
{
std::unique_ptr<char[]> substr(new char[length + 1]);
// unique_ptr (from <memory>) is probably paranoid overkill, but if something
// does go wrong, the array's destruction is virtually guaranteed
int i = 0;
while (i < length)
{
substr[i] = str[index2start + i];
i++;
}
substr[length] = '\0';// null terminate
cout<<substr.get()<<endl; // get() gets the array out of the unique_ptr
return MyString(substr.get()); // google "copy elision" for more information
// on this line.
}
Back in OP's code, on return to the main function, the memory that was substr starts to be reused and overwritten.
cout<<myStr1.str<<endl;
Prints myStr1.str and already we can see some of it has been reused and destroyed.
cout<<"here is the substring I've done:"<<endl;
myStr1.Print();
More death, more destruction, less string.
Things to not do in the future:
Sharing pointers where data should have been copied.
Pointers to temporary data.
Not null terminating strings.
Your function Substr returns the address of a local variable substr indirectly by storing a pointer to it in the return value MyString object. It's invalid to dereference a pointer to a local variable once it has gone out of scope.
I suggest you decide whether your class wraps an external string, or owns its own string data, in which case you will need to copy the input string to a member buffer.
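If you go the "owns its own string data" route, one possible sketch stores the characters in a std::vector<char>, so the compiler-generated copy, move, and destructor are already correct. This is a different design from the raw-pointer class in the question; c_str(), size(), the size_t parameters, and the clamping of out-of-range arguments are all assumptions of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical owning variant of MyString.
class MyString {
public:
    // Copies the characters, including the terminating null, into the vector.
    explicit MyString(const char* s) : buf(s, s + std::strlen(s) + 1) {}

    // Returns up to `length` characters starting at `start`;
    // both values are clamped so the copy never leaves the string.
    MyString Substr(std::size_t start, std::size_t length) const {
        const std::size_t n = size();
        start = std::min(start, n);
        length = std::min(length, n - start);
        MyString result("");
        result.buf.assign(buf.data() + start, buf.data() + start + length);
        result.buf.push_back('\0');   // null terminate the copy
        return result;                // owns its own data, nothing dangles
    }

    std::size_t size() const { return buf.size() - 1; }   // excludes the terminator
    const char* c_str() const { return buf.data(); }

private:
    std::vector<char> buf;   // always ends with '\0'
};
```

With the question's input, Substr(10, 7) asks for one character more than is available, and the clamp turns that into the six characters of "string" instead of a read past the end.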

Segmentation Fault in deleting char pointer

I'm posting two fragments here.
The first one is giving me Segmentation Fault on deallocating the memory. Second one is working fine.
1)
int main()
{
char* ch = new char;
ch = "hello";
cout << "\n " << ch << endl;
delete[] ch; ////OR delete ch; ---> have tried both
return 0;
}
2)
int main()
{
char* ch = new char;
cin >> ch;
cout << "\n " << ch << endl;
delete[] ch; ///OR delete ch /// Both are working fine....
return 0;
}
Could anybody please tell me why the first one is failing with Segmentation Fault and second one is working fine with both delete and delete[]. Because to me both the program seems to same.
new char allocates exactly one character (not an array of one character; use new char[1] for that),
so delete[] doesn't apply.
In the first example, you overwrite your pointer to the allocated character with a pointer to the string literal "hello". Deleting this string (as it is static memory) will result in a segfault.
Edit
int main()
{
char* ch = new char; // ch points to 1 character in dynamic memory
ch = "hello"; // overwrite ch with pointer to static memory "hello"
cout<<"\n "<<ch<<endl; // outputs the content of the static memory
delete[] ch; // tries to delete static memory
return 0;
}
There are issues with both examples:
char* ch = new char;
ch = "hello";
The new returns an address that points to dynamically allocated memory. You must save this return value so that delete can be issued later. The code above overwrites this value with "hello" (a string-literal). You now have lost the value, and thus can not call delete with the proper value.
The second example, even though you say "works fine" is still faulty.
char* ch = new char;
delete[] ch; ///OR delete ch /// Both are working fine....
The wrong form of delete is used. You allocated with new, so you must deallocate with delete, not delete[]. It works this way: new->delete, new[]->delete[].
Unlike most other languages, if you go against the rules of C++, corrupt memory, overwrite a buffer, etc., there is no guarantee that your program will crash, seg fault, etc. to let you know that you've done something wrong.
In this case, you're lucky that simple types such as char* are not affected by you using the wrong form of delete. But you cannot guarantee that this will always work if you change compilers, runtime settings, etc.
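The pairing rule can be illustrated with a short sketch. copyString and pairingDemo are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns a new[]-allocated copy of a C string.
// The caller must release it with delete[].
char* copyString(const char* src) {
    assert(src != nullptr);
    const std::size_t len = std::strlen(src);
    char* dst = new char[len + 1];    // array form of new, +1 for '\0'
    std::memcpy(dst, src, len + 1);   // copies the characters, not just the pointer
    return dst;
}

void pairingDemo() {
    char* one = new char;             // scalar new ...
    *one = 'h';
    delete one;                       // ... pairs with scalar delete

    char* many = copyString("hello"); // new[] inside copyString ...
    delete[] many;                    // ... pairs with delete[]
}
```

In both cases the pointer handed to delete or delete[] is exactly the value that new or new[] returned, which is what gets lost when ch is reassigned to a string literal.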
There are a couple of problems with each, namely that you're only allocating a single character when you're trying to allocate a character array.
In the first example, you're also allocating a single character and then subsequently reassign the pointer to a character array - ch = "hello" will not copy the string, just reassign the pointer. Your call to delete[] will then attempt to delete a string that is not heap allocated, hence the seg fault. And you're also leaking the char you allocated, too.
In the first one, you change the pointer to point to a string literal:
ch = "hello";
String literals are static arrays, so mustn't be deleted.
The second is wrong for at least two reasons:
you allocate a single character, not an array; a single character would be deleted with delete not delete[]
cin>>ch will (most likely) read more than one character, but you've only allocated space for one.
Both of these cause undefined behaviour, which might manifest itself as a visible error, or might appear to "work fine" - but could fail when you least expect it.
To allocate an array, use new char[SIZE]; but even then, a plain cin >> ch does not stop the user from giving too much input and overflowing the buffer unless you also limit the read, for example with std::setw.
Unless you're teaching yourself how to juggle raw memory (which is a dark art, best avoided unless absolutely necessary), you should stick to high-level types that manage memory for you:
std::string string;
string = "hello";
std::cout << string << '\n';
std::cin >> string;
std::cout << string << '\n';
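To make the contrast concrete, here is a small sketch of two ways to read input without overflowing anything. readWord and readLineBounded are hypothetical helper names. std::istream::getline is the standard member that stops after size - 1 characters and always writes a terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Reads one whitespace-delimited word; std::string grows as needed.
std::string readWord(std::istream& in) {
    std::string word;
    in >> word;
    return word;
}

// Reads at most size - 1 characters of a line into a caller-supplied buffer.
// Returns false if the line did not fit (the stream's failbit is then set)
// or if nothing could be read.
bool readLineBounded(std::istream& in, char* buffer, std::size_t size) {
    assert(buffer != nullptr && size > 0);
    in.getline(buffer, static_cast<std::streamsize>(size));
    return static_cast<bool>(in);
}
```

Both functions take a std::istream&, so they work with std::cin as well as with a std::istringstream when you want to try them out without typing.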
There are several errors in your programs.
In the first program you are not deleting something dynamically allocated but the statically allocated string "hello". In fact, when you execute ch = "hello" you are not copying the string into the buffer allocated by new char (which holds just one char, not what you are looking for); you make the pointer ch point to the start of the string "hello", which is located somewhere in non-writable memory (normally such strings are stored directly in the executable). So the delete operation is trying to deallocate something that cannot be deallocated. The first program could be rewritten like:
int main()
{
const char* ch = "hello";
cout<<"\n "<<ch<<endl;
return 0;
}
or like
int main()
{
char* ch = new char[strlen("hello")+1];
strcpy( ch, "hello");
cout<<"\n "<<ch<<endl;
delete[] ch; // with square brackets, it's an array
return 0;
}
Here's what's wrong with both snippets:
First snippet:
char* ch = new char; ch = "hello";
It's not legal to assign a string literal to a non-const char pointer.
Also, you re-assign the pointer immediately after you call new. The original value returned by new is now lost forever and cannot be freed for the duration of the program. This is known as a memory leak.
delete[] ch;
You try to deallocate the string literal. This crashes your program. You are only allowed to delete pointers that you get from new and delete[] pointers that you get from new[]. Deleting anything else has undefined behaviour.
Second snippet:
cout<<"\n "<<ch<<endl;
ch points to a single character, not a zero terminated char array. Passing this pointer to cout has undefined behaviour. You should use cout << *ch; to print that single character or make sure that ch points to a character array that is zero terminated.
delete[] ch;
You allocated with new, you must deallocate with delete. Using delete[] here has undefined behaviour.
Both are working fine....
"working fine" is one possible outcome of undefined behaviour, just like a runtime error is.
To answer the question, neither snippet is correct. First one crashes because you got lucky, second one appears to work because you got unlucky.
Solution: Use std::string.
You should use something like:
char* ch = new char[6] ;
strcpy(ch,"hello") ;
...
delete[] ch ;

How can I get this += operator to work?

I'm making my own string class and everything is working fine except one thing: I'm trying to overload operator += so that I can do this:
string s1 = "Hello", s2 = " World";
s1 += s2;
So this is what I tried:
// In the string class,
string& operator +=(string const& other)
{
using std::strcpy;
unsigned length = size() + other.size() + 1; // this->size() member
char *temp = new char[length]; // function and a "+ 1" for the
// null byte '\0'
strcpy(temp, buffer);
strcpy(temp, other.data());
delete[] buffer;
buffer = new char[length];
strcpy(buffer, temp); // copy temp into buffer
return *this;
}
But in my program I get no output after printing when using the code in main shown above. I also get no errors (not even a runtime error). Why is this and how can I fix this implementation?
Note: I know I can use std::string but I want to learn how to do this myself.
A problem is here:
strcpy(temp, other.data());
You've already copied the first string into the buffer (in the previous line), but this then overwrites it with the other string's data. You want to instead append the other string's data to the buffer using strcat:
strcat(temp, other.data());
As Jerry points out, your other problem is that you're not correctly initialising your strings in the first place.
As an aside, if you're going to use strcpy, strcat, etc, you should really use the length-limited versions (strncpy, strncat) to avoid potential buffer overrun issues.
Just took a quick glance at the demo you posted in the comment under Mac's answer. This is your problem:
string(char const *str) : buffer(new char[strlen(str)]), len(strlen(str))
{}
// ...
string s1 = "Hello";
You're allocating the buffer in the constructor, but never copying the data into it. What happens if you just do std::cout << s1;?
Edit: By the way, I noticed at least two other problems:
You're not updating len in operator +=
Your copy constructor is making two strings point to the same buffer. This is bad, it will blow up when one tries to use it after the other delete[]s it.
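Putting these fixes together, a corrected version might look roughly like the sketch below. The class is called String here to avoid clashing with std::string, and its member names (buffer, len) are guesses at the asker's layout. It uses memcpy with known lengths rather than strcpy/strcat, which is a substitution made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the asker's class.
class String {
public:
    // Allocates room for the terminator and actually copies the characters.
    String(const char* s = "") : len(std::strlen(s)), buffer(new char[len + 1]) {
        std::memcpy(buffer, s, len + 1);
    }
    // Deep copy: each String owns its own buffer.
    String(const String& other) : len(other.len), buffer(new char[len + 1]) {
        std::memcpy(buffer, other.buffer, len + 1);
    }
    String& operator=(const String& other) {
        if (this != &other) {
            char* copy = new char[other.len + 1];
            std::memcpy(copy, other.buffer, other.len + 1);
            delete[] buffer;
            buffer = copy;
            len = other.len;
        }
        return *this;
    }
    ~String() { delete[] buffer; }

    String& operator+=(const String& other) {
        const std::size_t newLen = len + other.len;
        char* temp = new char[newLen + 1];                     // +1 for '\0'
        std::memcpy(temp, buffer, len);                        // first part
        std::memcpy(temp + len, other.buffer, other.len + 1);  // appended part and '\0'
        delete[] buffer;
        buffer = temp;   // adopt temp directly; no second allocation needed
        len = newLen;    // keep the stored length in sync
        return *this;
    }

    const char* data() const { return buffer; }
    std::size_t size() const { return len; }

private:
    std::size_t len;   // declared before buffer: the constructors rely on this order
    char* buffer;
};
```

Because temp is fully built before the old buffer is released, s += s also works.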

C++ method parameter passed by reference - memory question

Assume,
void proc(CString& str)
{
str = "123";
}
void runningMethod()
{
CString str="ABC";
proc(str);
}
I understand that at the exit of runningMethod str will be deallocated automatically; in this case, how does C++ delete the old data ("ABC")?
Thanks,
Gil.
"ABC" was overwritten when you said = "123".
Internally, a string is an array of characters. At start, it made a new buffer that contained {'A', 'B', 'C', '\0'}. When you assigned, it just wrote '1' over the 'A', and so on.
When it destructed, it deleted the buffer.
The same happens as if you'd write:
CString foo = "ABC";
foo = "123";
The exact details depend on the implementation of CString, but the important bit is that you don't have to worry about allocation and deallocation now that the class takes care of it for you.
In most cases, when you do your assignment in proc(), "ABC" will be freed. This is usually done in the overloaded assignment operator. Here is an example of what such an overload looks like:
String& String::operator= (const String& other)
{
char* otherdata = other.data;
char* olddata = data;
if (otherdata != 0)
{
data = new char[other.length+1];
length = other.length;
memcpy(data,otherdata,other.length+1);
}
else
{
data = 0;
length = 0;
}
if (olddata != 0)
{
delete[] olddata;
}
return *this;
}
A couple of things to keep in mind here. First, the operator= of a class will generally take care of deleting anything it used to refer to before assigning the new data. Well, that's not entirely true: oftentimes a smart developer will implement operator= by first creating a copy of the incoming object and then swapping the current data with the new temporary, which now has ownership of the old data and deletes it. The important part to remember though is that before the operator= function exits, the old data has, generally speaking, been discarded.
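The copy-then-swap approach described above can be sketched like this. String is a hypothetical class for illustration only and says nothing about how CString is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical string class showing copy-and-swap assignment.
class String {
public:
    String(const char* s = "") : length(std::strlen(s)), data(new char[length + 1]) {
        std::memcpy(data, s, length + 1);
    }
    String(const String& other) : length(other.length), data(new char[length + 1]) {
        std::memcpy(data, other.data, length + 1);
    }
    ~String() { delete[] data; }

    void swap(String& other) noexcept {
        std::swap(data, other.data);
        std::swap(length, other.length);
    }

    // The parameter is taken by value, so the copy is made before the body runs.
    String& operator=(String other) {
        swap(other);    // *this takes the new data, other takes the old data
        return *this;
    }                   // other is destroyed here and deletes the old buffer

    const char* c_str() const { return data; }
    std::size_t size() const { return length; }

private:
    std::size_t length;   // declared before data: the constructors rely on this order
    char* data;
};
```

If the copy throws, the assignment never starts, so the target keeps its old contents. Self-assignment needs no special check, at the cost of one redundant copy.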
The other thing to keep in mind is that "ABC" is a string literal. The standard doesn't really define how they have to be stored, it simply states limitations that allow certain usual implementations. Very often that string literal will appear as a read-only element within the program data. In that case it will never be deleted so long as the program's image is loaded into memory (when it's running basically). This is the whole reason why code like this is UB:
void f()
{
char * x = "hello"; // points to a string literal.
x[0] = 'H';
}
// correct implementation is:
void f()
{
char x[] = "hello"; // reserves an array of 6 characters and copies the string literal's content into it.
x[0] = 'H';
}