C++ method parameter passed by reference - memory question - c++

Assume,
void proc(CString& str)
{
str = "123";
}
void runningMethod()
{
CString str="ABC";
proc(str);
}
I understand that at the exit of runningMethod str will be deallocated automatically; in this case, how does C++ delete the old data ("ABC")?
Thanks,
Gil.

"ABC" was overwritten when you said = "123".
Internally, a string is an array of characters. At start, it made a new buffer that contained {'A', 'B', 'C', '\0'}. When you assigned, it just wrote '1' over the 'A', and so on.
When it destructed, it deleted the buffer.

The same happens as if you'd write:
CString foo = "ABC";
foo = "123";

The exact details depend on the implementation of CString, but the important bit is that you don't have to worry about allocation and deallocation now that the class takes care of it for you.

In most cases when you do your assignment in proc() "ABC" will be freed. This is usually done in overloaded operator method. For example here
you have example how such overload looks like.
String& String::operator= (const String& other)
{
char* otherdata = other.data;
char* olddata = data;
if (otherdata != 0)
{
data = new char[other.length+1];
length = other.length;
memcpy(data,otherdata,other.length+1);
}
else
{
data = 0;
length = 0;
}
if (olddata != 0)
{
delete[] olddata;
}
return *this;
}

A couple things to keep in mind here. First, the operator= of a class will generally take care of deleting anything it used to refer to before assigning the new data. Well, that's not entirely true, often times a smart developer will implement operator= by first creating a copy of the incoming class and then swapping current data with the new temporary, which now has ownership and deletes it. The important part to remember though is that before the operator= function exists the old data has, generally speaking, been discarded.
The other thing to keep in mind is that "ABC" is a string literal. The standard doesn't really define how they have to be stored, it simply states limitations that allow certain usual implementations. Very often that string literal will appear as a read-only element within the program data. In that case it will never be deleted so long as the program's image is loaded into memory (when it's running basically). This is the whole reason why code like this is UB:
void f()
{
char * x = "hello"; // points to a string literal.
x[0] = 'H';
}
// correct implementation is:
void f()
{
char x[] = "hello"; // reserved an array of 6 characters and copies string literal content.
x[0] = 'H';
}

Related

How does the '->' operator work and is it a good implementation to modify a large string?

I want to begin with saying that I have worked with pointers before and I assumed I understood how they worked. As in,
int x = 5;
int *y = &x;
*y = 3;
std::cout << x; // Would output 3
But then I wanted to make a method which modifies a rather large string and I believe therefore it would be better to pass a reference to the string in order to avoid passing the entire string back and fourth. So I pass my string to myFunc() and I do the same thing as I did with the numbers above. Which means I can modify *str as I do in the code below. But in order to use methods for String I need to use the -> operator.
#include <iostream>
#include <string>
int myFunc(std::string *str) { // Retrieve the address to which str will point to.
*str = "String from myFunc"; // This is how I would normally change the value of myString
str->replace(0, 1, "s"); // Replacing index 0 with a lowercase s.
return 0;
}
int main() {
std::string myString << "String from main";
myFunc(&myString); // Pass address of myString to myFunc()
}
My questions are:
Since str in myFunc is an address, why can an address use an
operator such as -> and how does it work? Is it as simple as the
object at the address str's method is used? str->replace(); // str->myString.replace()?
Is this a good implementation of modifying a large string or would it better to pass the string to the method and return the string when its modified??
ptr->x is identical to (*ptr).x unless -> is overridden for a type you're dereferencing. On normal pointers, that works as you'd expect it to.
As for implementation, profile it when you implement it. You can't know what compiler will do with this once you turn optimizations on. For example, if given function gets inlined, you won't even have any extra indirection in the first place and it won't matter which way you do it. As long as you don't allocate a new string, differences should generally be negligible.
str is a pointer to std::string object. The arrow operator, ->, is used to dereference the pointer and then access its member. Alternatively, you can also write (*str).replace(0,1,"s"); here, * dereferences the pointer and then . access the member function replace().
Pointers are often confusing; it is better to use references when possible.
void myFunc(std::string &str) { // Retrieve the address to which str will point to.
str = "String from myFunc"; // This is how I would normally change the value of myString
str.replace(0, 1, "s"); // Replacing index 0 with a lowercase s.
}
int main() {
std::string myString = "String from main";
myFunc(myString); // Pass address of myString to myFunc()
}
Is this a good implementation of modifying a large string or would it better to pass the string to the method and return the string when its modified??
If you don't want to change the original string then create a new string and return it.
If it's ok for your application to modify the original string then do it. Also you can return a reference to a modified string if you need to chain function calls.
std::string& myFunc(std::string &str) { // Retrieve the address to which str will point to.
str = "String from myFunc"; // This is how I would normally change the value of myString
return str.replace(0, 1, "s"); // Replacing index 0 with a lowercase s.
}

C++ substring from string

I'm pretty new to C++ and I'm need to create MyString class, and its method to create new MyString object from another's substring, but chosen substring changes while class is being created and when I print it with my method.
Here is my code:
#include <iostream>
#include <cstring>
using namespace std;
class MyString {
public:
char* str;
MyString(char* str2create){
str = str2create;
}
MyString Substr(int index2start, int length) {
char substr[length];
int i = 0;
while(i < length) {
substr[i] = str[index2start + i];
i++;
}
cout<<substr<<endl; // prints normal string
return MyString(substr);
}
void Print() {
cout<<str<<endl;
}
};
int main() {
char str[] = {"hi, I'm a string"};
MyString myStr = MyString(str);
myStr.Print();
MyString myStr1 = myStr.Substr(10, 7);
cout<<myStr1.str<<endl;
cout<<"here is the substring I've done:"<<endl;
myStr1.Print();
return 0;
}
And here is the output:
hi, I'm a string
string
stri
here is the substring I've done:
♦
Have to walk this through to explain what's going wrong properly so bear with me.
int main() {
char str[] = {"hi, I'm a string"};
Allocated a temporary array of 17 characters (16 letters plus a the terminating null), placed the characters "hi, I'm a string" in it, and ended it off with a null. Temporary means what it sound like. When the function ends, str is gone. Anything pointing at str is now pointing at garbage. It may shamble on for a while and give some semblance of life before it is reused and overwritten, but really it's a zombie and can only be trusted to kill your program and eat its brains.
MyString myStr = MyString(str);
Creates myStr, another temporary variable. Called the constructor with the array of characters. So let's take a look at the constructor:
MyString(char* str2create){
str = str2create;
}
Take a pointer to a character, in this case it will have a pointer to the first element of main's str. This pointer will be assigned to MyString's str. There is no copying of the "hi, I'm a string". Both mains's str and MyString's strpoint to the same place in memory. This is a dangerous condition because alterations to one will affect the other. If one str goes away, so goes the other. If one str is overwritten, so too is the other.
What the constructor should do is:
MyString(char* str2create){
size_t len = strlen(str2create); //
str = new char[len+1]; // create appropriately sized buffer to hold string
// +1 to hold the null
strcpy(str, str2create); // copy source string to MyString
}
A few caveats: This is really really easy to break. Pass in a str2create that never ends, for example, and the strlen will go spinning off into unassigned memory and the results will be unpredictable.
For now we'll assume no one is being particularly malicious and will only enter good values, but this has been shown to be really bad assumption in the real world.
This also forces a requirement for a destructor to release the memory used by str
virtual ~MyString(){
delete[] str;
}
It also adds a requirement for copy and move constructors and copy and move assignment operators to avoid violating the Rule of Three/Five.
Back to OP's Code...
str and myStr point at the same place in memory, but this isn't bad yet. Because this program is a trivial one, it never becomes a problem. myStr and str both expire at the same point and neither are modified again.
myStr.Print();
Will print correctly because nothing has changed in str or myStr.
MyString myStr1 = myStr.Substr(10, 7);
Requires us to look at MyString::Substr to see what happens.
MyString Substr(int index2start, int length) {
char substr[length];
Creates a temporary character array of size length. First off, this is non-standard C++. It won't compile under a lot of compilers, do just don't do this in the first place. Second, it's temporary. When the function ends, this value is garbage. Don't take any pointers to substr because it won't be around long enough to use them. Third, no space was reserved for the terminating null. This string will be a buffer overrun waiting to happen.
int i = 0;
while(i < length) {
substr[i] = str[index2start + i];
i++;
}
OK that's pretty good. Copy from source to destination. What it is missing is the null termination so users of the char array knows when it ends.
cout<<substr<<endl; // prints normal string
And that buffer overrun waiting to happen? Just happened. Whups. You got unlucky because it looks like it worked rather than crashing and letting you know that it didn't. Must have been a null in memory at exactly the right place.
return MyString(substr);
And this created a new MyString that points to substr. Right before substr hit the end of the function and died. This new MyString points to garbage almost instantly.
}
What Substr should do:
MyString Substr(int index2start, int length)
{
std::unique_ptr<char[]> substr(new char[length + 1]);
// unique_ptr is probably paranoid overkill, but if something does go
// wrong, the array's destruction is virtually guaranteed
int i = 0;
while (i < length)
{
substr[i] = str[index2start + i];
i++;
}
substr[length] = '\0';// null terminate
cout<<substr.get()<<endl; // get() gets the array out of the unique_ptr
return MyString(substr.get()); // google "copy elision" for more information
// on this line.
}
Back in OP's code, with the return to the main function that which was substr starts to be reused and overwritten.
cout<<myStr1.str<<endl;
Prints myStr1.str and already we can see some of it has been reused and destroyed.
cout<<"here is the substring I've done:"<<endl;
myStr1.Print();
More death, more destruction, less string.
Things to not do in the future:
Sharing pointers where data should have been copied.
Pointers to temporary data.
Not null terminating strings.
Your function Substr returns the address of a local variable substr indirectly by storing a pointer to it in the return value MyString object. It's invalid to dereference a pointer to a local variable once it has gone out of scope.
I suggest you decide whether your class wraps an external string, or owns its own string data, in which case you will need to copy the input string to a member buffer.

Character pointer access

I wanted to access character pointer ith element. Below is the sample code
string a_value = "abcd";
char *char_p=const_cast<char *>(a_value.c_str());
if(char_p[2] == 'b') //Is this safe to use across all platform?
{
//do soemthing
}
Thanks in advance
Array accessors [] are allowed for pointer types, and result in defined and predictable behaviors if the offset inside [] refers to valid memory.
const char* ptr = str.c_str();
if (ptr[2] == '2') {
...
}
Is correct on all platforms if the length of str is 3 characters or more.
In general, if you are not mutating the char* you are looking at, it best to avoid a const_cast and work with a const char*. Also note that std::string provides operator[] which means that you do not need to call .c_str() on str to be able to index into it and look at a char. This will similarly be correct on all platforms if the length of str is 3 characters or more. If you do not know the length of the string in advance, use std::string::at(size_t pos), which performs bound checking and throws an out_of_range exception if the check fails.
You can access the ith element in a std::string using its operator[]() like this:
std::string a_value = "abcd";
if (a_value[2] == 'b')
{
// do stuff
}
If you use a C++11 conformant std::string implementation you can also use:
std::string a_value = "abcd";
char const * p = &a_value[0];
// or char const * p = a_value.data();
// or char const * p = a_value.c_str();
// or char * p = &a_value[0];
21.4.1/5
The char-like objects in a basic_string object shall be stored contiguously.
21.4.7.1/1: c_str() / data()
Returns: A pointer p such that p + i == &operator[](i) for each i in [0,size()].
The question is essentially about querying characters in a string safely.
const char* a = a_value.c_str();
is safe unless some other operation modifies the string after it. If you can guarantee that no other code performs a modification prior to using a, then you have safely retrieved a pointer to a null-terminated string of characters.
char* a = const_cast<char *>(a_value.c_str());
is never safe. You have yielded a pointer to memory that is writeable. However, that memory was never designed to be written to. There is no guarantee that writing to that memory will actually modify the string (and actually no guarantee that it won't cause a core dump). It's undefined behaviour - absolutely unsafe.
reference here: http://en.cppreference.com/w/cpp/string/basic_string/c_str
addressing a[2] is safe provided you can prove that all possible code paths ensure that a represents a pointer to memory longer than 2 chars.
If you want safety, use either:
auto ch = a_string.at(2); // will throw an exception if a_string is too short.
or
if (a_string.length() > 2) {
auto ch = a_string[2];
}
else {
// do something else
}
Everyone explained very well for most how it's safe, but i'd like to extend a bit if that's ok.
Since you're in C++, and you're using a string, you can simply do the following to access a caracter (and you won't have any trouble, and you still won't have to deal with cstrings in cpp :
std::string a_value = "abcd";
std::cout << a_value.at(2);
Which is in my opinion a better option rather than going out of the way.
string::at will return a char & or a const char& depending on your string object. (In this case, a const char &)
In this case you can treat char* as an array of chars (C-string). Parenthesis is allowed.

Resizing char array not always working

In a custom string class called Str I have a function c_str() that just returns the private member char* data as const char* c_str() const { return data; }. This works when called after I create a new Str but if I then overwrite the Str using cin, calling c_str() on it only sometimes works, but always works if I cin a bigger Str than the original.
Str b("this is b");
cout << b.c_str() << endl;
cin >> b;
cout << b.c_str() << endl;
Here the first b.c_str() works but if I attempt to change Str b to just 'b' on the cin >> b; line then it outputs 'b' + a bit of garbage. But if I try to change it to 'bb' it usually works, and if I change it to something longer than "this is b", it always works.
This is odd because my istream operator (which is friended) completely deallocates the Str and ends up allocating a new char array only 1 char larger for each char it reads in (just to see if it would work, it doesn't). So it seems like returning the array after reading in something else would return the new array that data is set it.
Relevant functions:
istream& operator>>(istream& is, Str& s) {
delete[] s.data;
s.data = nullptr;
s.length = s.limit = 0;
char c;
while (is.get(c) && isspace(c)) ;
if (is) {
do s.push_back(c);
while (is.get(c) && !isspace(c));
if (is)
is.unget();
}
return is;
}
void Str::push_back(char c) {
if (length == limit) {
++limit;
char* newData = new char[limit];
for (size_type i = 0; i != length; ++i)
newData[i] = data[i];
delete[] data;
data = newData;
}
data[length++] = c;
}
With push_back() like this, the array never has a capacity larger than what it holds, so I don't see how my c_str() could output any memory garbage.
Based on the push_back() in the question and the c_str() in the comment, there is no guarantee that the C-string returned from c_str() is null-terminated. Since a char const* doesn't know the length of the string without the null-terminator this is the source of the problem!
When allocating small memory objects you probably get back one of the small memory object previously used by you string class and that contains non-null characters, causing the printed character appear as if it is of what is the length to first null byte found. When allocating bigger chunks you seem to get back "fresh" memory which still contains null character, making the situation appear as if all is OK.
There are basically two ways to fix this problem:
Add a null-terminator before returning a char const* from c_str(). If you don't care multi-threading for now, this can be done in the c_str() function. In contexts where multi-threading matters it is probably a bad idea to make any mutations in const member functions as these would introduce data races. Thus, the C++ standard string classes add the null-terminator in one of the mutating operations.
Do not support a c_str() function at all but rather implement an output operator for your string class. This way, no null-termination is needed.

C++ new & delete and string & functions

Okay the previous question was answered clearly, but i found out another problem.
What if I do:
char *test(int ran){
char *ret = new char[ran];
// process...
return ret;
}
And then run it:
for(int i = 0; i < 100000000; i++){
string str = test(rand()%10000000+10000000);
// process...
// no need to delete str anymore? string destructor does it for me here?
}
So after converting the char* to string, I don't have to worry about the deleting anymore?
Edit: As answered, I have to delete[] each new[] call, but on my case its not possible since the pointer got lost, so the question is: how do I convert char to string properly?
Here you are not converting the char* to a [std::]string, but copying the char* to a [std::]string.
As a rule of thumb, for every new there should be a delete.
In this case, you'll need to store a copy of the pointer and delete it when you're done:
char* temp = test(rand()%10000000+10000000);
string str = temp;
delete[] temp;
You seem to be under the impresison that passing a char* into std::string transfers ownership of the allocated memory. In fact it just makes a copy.
The easiest way to solve this is to just use a std::string throughout the entire function and return it directly.
std::string test(int ran){
std::string ret;
ret.resize(ran - 1); // If accessing by individual character, or not if using the entire string at once.
// process... (omit adding the null terminator)
return ret;
}
Yes, yes you do.
If you are using linux/os x, look into something like valgrind which can help you with memory issues
You can change your test function so that it returns a string instead of char *, this way you can delete [] ret in the test function.
OR you could just use a string in test as well and not have to worry about new/delete.
You must call delete for every new otherwise you will leak memory. In the case you have shown you are throwing away the pointer, if you must leave the function as returning a char* then you will need to use two lines to create the std::string so you can retain a copy of the char* to delete.
A better solution would be to rewrite your test() function to return a std::string directly.
You need to do something like this:
for(int i = 0; i < 100000000; i++){
int length = rand()%10000000+10000000;
char* tmp = test(length);
string str(tmp);
delete[length] tmp;
}
This deletes the allocated char-array properly.
By the way, you should always zero-terminate a string if you create it this way (i.e. inside the function test), otherwise some functions can easily get "confused" and treat data behind your string as part of it, which in the best case crashes your application, and in the worst case creating a silent buffer overflow leading to undefined behaviour at a later point, which is the ultimate debugging nightmare... ;)