If I have the following code:
{
UnicodeString sFish = L"FISH";
char *szFish = AnsiString(sFish).c_str();
CallFunc(szFish);
}
Then what is the scope of the temporary AnsiString that's created, and for how long is szFish pointing to valid data? Will it still be valid for the CallFunc function?
Will its scope last just the one line, or for the whole block?
szFish is invalid before the call to CallFunc(), because the AnsiString is a temporary object that is destroyed at the end of the declaration, leaving szFish pointing at an internal buffer that has just been freed.
Ensure that the AnsiString instance is valid for the invocation of CallFunc(). For example:
CallFunc(AnsiString(sFish).c_str());
I would replace:
char *szFish = AnsiString(sFish).c_str();
with:
AnsiString as(sFish);
char *szFish = as.c_str();
I don't know the AnsiString class but in your code its destructor will fire before your call to CallFunc(), and will most probably release the string you point to with *szFish. When you replace the temporary object with a "named" object on stack its lifetime will extend until the end of the block it is defined in.
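The difference between the temporary and the named object can be made visible with a small sketch. Traced is a hypothetical stand-in (not the real AnsiString) that records its construction and destruction in a log; the dangling pointer is stored but never dereferenced:

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for AnsiString: logs when it is created and destroyed.
struct Traced {
    std::vector<std::string>& log;
    std::string text;
    Traced(std::vector<std::string>& l, const char* t) : log(l), text(t) {
        log.push_back("ctor");
    }
    ~Traced() { log.push_back("dtor"); }
    const char* c_str() const { return text.c_str(); }
};

// Temporary: destroyed at the end of the declaration that creates it.
std::vector<std::string> with_temporary() {
    std::vector<std::string> log;
    {
        const char* p = Traced(log, "FISH").c_str();  // p dangles from here on
        (void)p;                                      // never dereferenced
        log.push_back("next statement");
    }
    return log;
}

// Named object: destroyed when its enclosing block ends.
std::vector<std::string> with_named_object() {
    std::vector<std::string> log;
    {
        Traced as(log, "FISH");
        const char* p = as.c_str();  // valid for as long as `as` lives
        (void)p;
        log.push_back("next statement");
    }
    return log;
}
```

With the temporary, "dtor" is logged before "next statement"; with the named object it is logged after.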
The C++11 standard, §12.2/3, says:
When an implementation introduces a temporary object of a class that
has a non-trivial constructor (12.1, 12.8), it shall ensure that a
constructor is called for the temporary object. Similarly, the
destructor shall be called for a temporary with a non-trivial
destructor (12.4). Temporary objects are destroyed as the last step in
evaluating the full-expression (1.9) that (lexically) contains the
point where they were created. This is true even if that evaluation
ends in throwing an exception. The value computations and side effects
of destroying a temporary object are associated only with the
full-expression, not with any specific subexpression.
(emphasis mine)
There are additional caveats to this, but they don't apply in this situation. In your case the full expression is the indicated part of this statement:
char *szFish = AnsiString(sFish).c_str();
//             ^^^^^^^^^^^^^^^^^^^^^^^^^
So, the instant szFish is initialized, the destructor of your temporary object (i.e. AnsiString(sFish)) will be called and its internal memory representation (where c_str() points to) will be released. Thus, szFish immediately becomes a dangling pointer, and any access through it is undefined behaviour.
You can get around this by saying
CallFunc(AnsiString(sFish).c_str());
instead, as here, the temporary will be destroyed (again) after the full expression (that is, right at the ;) and CallFunc will be able to read the raw string.
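A rough way to check the ordering in the direct-call form is to log events. In this sketch Traced is again a hypothetical stand-in for AnsiString, and CallFunc is assumed to return the string length so there is something to observe:

```cpp
#include <cstring>
#include <string>
#include <vector>

std::vector<std::string> events;  // global log, for illustration only

// Hypothetical stand-in for AnsiString.
struct Traced {
    std::string text;
    explicit Traced(const char* t) : text(t) { events.push_back("ctor"); }
    ~Traced() { events.push_back("dtor"); }
    const char* c_str() const { return text.c_str(); }
};

// Hypothetical callee: reads the C string while the temporary is still alive.
std::size_t CallFunc(const char* s) {
    events.push_back("CallFunc");
    return std::strlen(s);
}

std::size_t demo_call() {
    events.clear();
    std::size_t n = CallFunc(Traced("FISH").c_str());  // temporary dies at the ';'
    events.push_back("after call");
    return n;
}
```

The log shows the destructor running only after CallFunc has returned, but before the next statement.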
The scope of the AnsiString in this case is "from right before the call to c_str(), until right after."
It may help to think of it this way:
char *szFish;
{
AnsiString tmpString(sFish);
szFish = tmpString.c_str();
}
As per the C++ standard ([class.temporary]), which says:
When an implementation introduces a temporary object of a class that
has a non-trivial constructor ([class.default.ctor],
[class.copy.ctor]), it shall ensure that a constructor is called for
the temporary object. Similarly, the destructor shall be called for a
temporary with a non-trivial destructor ([class.dtor]). Temporary
objects are destroyed as the last step in evaluating the
full-expression ([intro.execution]) that (lexically) contains the
point where they were created. This is true even if that evaluation
ends in throwing an exception. The value computations and side effects
of destroying a temporary object are associated only with the
full-expression, not with any specific subexpression.
How should I understand "Temporary objects are destroyed as the last step in evaluating the full-expression ([intro.execution]) that (lexically) contains the point where they were created."? Could anybody make it clear with some simple examples?
Simple example. This expression produces a temporary object:
std::string("test")
Here, that expression is used as a subexpression:
function(std::string("test"));
// point A
At point A, the temporary object has been destroyed because the point is after the full-expression where the temporary object was created.
Here is an example of how to write a bug if this rule is not understood:
const std::string& function(const std::string& arg) {
return arg;
}
const std::string& ref = function("test");
std::cout << ref;
Here, the temporary object that was created as the argument is destroyed after the full expression, and therefore ref has become invalid - a dangling reference. The behaviour is undefined when the invalid reference is inserted into the output stream.
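For contrast, here are two ways of using the same function that do not leave a dangling reference. This is only a sketch; the wrapper function names are made up for the example:

```cpp
#include <string>

// Same pass-through function as above: returns its argument by reference.
const std::string& function(const std::string& arg) {
    return arg;
}

// Safe: the result is copied into `copy` before the full-expression ends,
// i.e. while the temporary std::string built from "test" still exists.
std::string copy_within_full_expression() {
    std::string copy = function("test");
    return copy;
}

// Also safe: the argument is a named object that outlives the reference.
std::string bind_to_named_object() {
    std::string named = "test";
    const std::string& ref = function(named);
    return ref;  // copies from `named`, which is still alive
}
```

In both cases nothing refers to the temporary once the statement that created it has finished.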
An explanation that works in many cases is that temporary objects are destroyed when execution reaches the semicolon at the end of the statement. Some language constructs (such as a for loop) are not covered by this explanation, so don't push it too hard. For a better explanation of the exceptions, see Statement that isn't a full-expression.
As one example:
i = foo1(foo2(std::string("test")));
The temporary string is kept alive until after the assignment, as the assignment occurs before the end of the statement. (In this case, the full expression is the statement.)
I've just been thinking about the following bit of code:
PerformConflict(m_dwSession,
CONFLICT_DETECTED,
item.GetConflictedFile().GetUnNormalizedPath().c_str(),
item.GetSuggestedFile().GetUnNormalizedPath().c_str());
GetConflictedFile() returns an object.
GetUnNormalizedPath() returns a std::wstring.
c_str() just returns a const wchar_t* (in this case to the contents of an rvalue std::wstring).
My question is: Does anything in the spec guarantee that this code is safe? I.e. are all the rvalue objects guaranteed not to have been destroyed by the time that c_str() is getting a pointer to their contents?
Those temporaries will be destroyed at the end of the full expression they appear in. In your case, that's the entire snippet you posted.
This will be absolutely fine, so long as you only use that const wchar_t* inside that function invocation. If you store it anywhere and try to access it after the call exits, you would be thrust down the deep dark hole of UB.
The relevant standards quote is (emphasis mine):
N3337 [class.temporary]/3:
When an implementation introduces a temporary object of a class that has a non-trivial constructor (12.1,
12.8), it shall ensure that a constructor is called for the temporary object. Similarly, the destructor shall be
called for a temporary with a non-trivial destructor (12.4). Temporary objects are destroyed as the last step
in evaluating the full-expression (1.9) that (lexically) contains the point where they were created. This is true
even if that evaluation ends in throwing an exception. The value computations and side effects of destroying
a temporary object are associated only with the full-expression, not with any specific subexpression.
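A minimal sketch of the same call shape, assuming hypothetical Item and File types and a PerformConflict that only reads its arguments during the call (the real signatures are not shown in the question):

```cpp
#include <cwchar>
#include <string>

// Hypothetical types mirroring the shape of the call chain in the question.
struct File {
    std::wstring path;
    std::wstring GetUnNormalizedPath() const { return path; }  // returns by value
};

struct Item {
    File GetConflictedFile() const { return File{L"a.txt"}; }
    File GetSuggestedFile() const { return File{L"bb.txt"}; }
};

// Hypothetical callee: only reads the pointers while the call is running.
std::size_t PerformConflict(const wchar_t* conflicted, const wchar_t* suggested) {
    return std::wcslen(conflicted) + std::wcslen(suggested);
}

std::size_t demo_chain() {
    Item item;
    // All four temporaries (two File, two std::wstring) live until the ';'.
    return PerformConflict(item.GetConflictedFile().GetUnNormalizedPath().c_str(),
                           item.GetSuggestedFile().GetUnNormalizedPath().c_str());
}
```

Both pointers stay valid for the whole call because every temporary in the chain belongs to the same full-expression.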
As illustrated by Herb Sutter, rvalues are destroyed at the end of the expression in which they appear. However, if you bind them to "a reference to const on the stack", their lifetime is extended to that of the reference.
So, basically, if your function has this kind of signature:
PerformConflict(...,
...,
const std::string& str1, //< a temporary bound here lives until the end of the full-expression containing the call
const std::string& str2 //< likewise for a temporary bound to str2
);
You should be able to manipulate the strings inside PerformConflict() without problems.
PS: the problem can also be solved if you pass the arguments by value (i.e. const std::string str1)
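The two kinds of reference binding can be compared with a logging sketch. Traced and the function names are hypothetical; the point is only where "dtor" lands in the log:

```cpp
#include <string>
#include <vector>

std::vector<std::string> trace;  // global log, for illustration only

// Hypothetical type that logs its construction and destruction.
struct Traced {
    Traced() { trace.push_back("ctor"); }
    ~Traced() { trace.push_back("dtor"); }
};

// Binding to a local const reference extends the temporary's lifetime.
void extended_by_local_reference() {
    trace.clear();
    {
        const Traced& ref = Traced();  // temporary now lives as long as ref
        (void)ref;
        trace.push_back("still alive");
    }                                  // ref's scope ends; temporary destroyed
    trace.push_back("block ended");
}

void callee(const Traced&) { trace.push_back("in callee"); }

// Binding to a reference parameter keeps the temporary alive only until the
// end of the full-expression that contains the call.
void bound_to_parameter() {
    trace.clear();
    callee(Traced());
    trace.push_back("after call");
}
```

Either way the object is alive for the whole time the callee or the local reference can see it.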
I wonder whether void func(const char *str); receives a valid str if I write the following:
auto str = string("hello").c_str();
func(str);
How is it different from code below?
func(string("hello").c_str())
In both cases, the string object is a temporary, destroyed at the end of the statement.
In the first case, str ends up dangling - pointing to memory that was managed by the temporary string, but which has now been destroyed. Doing anything with it is an error, giving undefined behaviour.
In the second case, the temporary string is not destroyed until after the function returns. So this is fine, as long as the function doesn't keep hold of the pointer for something else to use later.
The difference is that the first creates a temporary string object that gets destroyed at the end of the first statement, so str becomes a dangling pointer. The second also creates a temporary, but it exists throughout the call to func because the temporary object doesn't get destroyed until after the call to func returns.
From Paragraph 12.2/3 of the C++11 Standard:
[...] Temporary objects are destroyed as the last step in evaluating the full-expression (1.9) that (lexically) contains the point where they were created. This is true even if that evaluation ends in throwing an exception. [...]
This means that the temporary created within the expression that contains the call to func() will live until function call returns.
On the other hand, the lifetime of the temporary in the first code snippet will end before func() is invoked, and str will be dangling. Using it results in Undefined Behavior.
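Two safe ways to write the call, as a sketch. Here func is assumed to return the string length (the original returns void) so that the result can be checked:

```cpp
#include <cstring>
#include <string>

// Hypothetical callee standing in for func(): reads the string during the call.
std::size_t func(const char* str) { return std::strlen(str); }

// Option 1: pass the temporary's buffer straight into the call.
std::size_t pass_directly() {
    return func(std::string("hello").c_str());
}

// Option 2: keep the std::string itself, not just the pointer.
std::size_t keep_the_string() {
    auto owner = std::string("hello");  // owner lives until the function ends
    const char* str = owner.c_str();    // valid while owner is alive and unmodified
    return func(str);
}
```

The second form is the one to use when the pointer has to survive across several statements.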
int LegacyFunction(const char *s) {
// do something with s, like print it to standard output
// this function does NOT retain any pointer to s after it returns.
return strlen(s);
}
std::string ModernFunction() {
// do something that returns a string
return "Hello";
}
LegacyFunction(ModernFunction().c_str());
The above example could easily be rewritten to use smart pointers instead of strings; I've encountered both of these situations many times. Anyway, the above example will construct an STL string in ModernFunction, return it, then get a pointer to a C-style string inside of the string object, and then pass that pointer to the legacy function.
There is a temporary string object that exists after ModernFunction has returned. When does it go out of scope?
Is it possible for the compiler to call c_str(), destruct this temporary string object, and then pass a dangling pointer to LegacyFunction? (Remember that the string object is managing the memory that c_str() return value points to...)
If the above code is not safe, why is it not safe, and is there a better, equally concise way to write it than adding a temporary variable when making the function calls? If it's safe, why?
LegacyFunction(ModernFunction().c_str());
The temporary is destroyed after the full-expression has been evaluated (i.e. after LegacyFunction returns).
n3337 12.2/3
Temporary objects are destroyed as the last step
in evaluating the full-expression (1.9) that (lexically) contains the point where they were created.
n3337 1.9/10
A full-expression is an expression that is not a subexpression of another expression. If a language construct
is defined to produce an implicit call of a function, a use of the language construct is considered to be an
expression for the purposes of this definition. A call to a destructor generated at the end of the lifetime of
an object other than a temporary object is an implicit full-expression. Conversions applied to the result of
an expression in order to satisfy the requirements of the language construct in which the expression appears
are also considered to be part of the full-expression.
[ Example:
struct S {
S(int i): I(i) { }
int& v() { return I; }
private:
int I;
};
S s1(1); // full-expression is call of S::S(int)
S s2 = 2; // full-expression is call of S::S(int)
void f() {
if (S(3).v()) // full-expression includes lvalue-to-rvalue and
// int to bool conversions, performed before
// temporary is deleted at end of full-expression
{ }
}
There is a temporary string object that exists after ModernFunction has returned. When does it go out of scope?
Strictly speaking, it's never in scope. Scope is a property of a name, not an object. It just so happens that automatic variables have a very close association between scope and lifetime. Objects that aren't automatic variables are different.
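A small illustration of scope and lifetime coming apart, using a hypothetical next_id function: the name is visible only inside the function, while the object it names persists for the rest of the program.

```cpp
// The name `count` is only in scope inside next_id(), but the object it names
// has static storage duration and lives until the program ends.
int& next_id() {
    static int count = 0;
    ++count;
    return count;
}
```

Callers can keep using the object through the returned reference even though they can never name `count` themselves.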
Temporary objects are destroyed at the end of the full-expression in which they appear, with a couple of exceptions that aren't relevant here. In any case, the special cases extend the lifetime of the temporary; they don't reduce it.
Is it possible for the compiler to call c_str(), destruct this temporary string object, and then pass a dangling pointer to LegacyFunction
No, because the full-expression is LegacyFunction(ModernFunction().c_str()) (excluding the semi-colon: feel that pedantry), so the temporary that is the return value of ModernFunction is not destroyed until LegacyFunction has returned.
If it's safe, why?
Because the lifetime of the temporary is long enough.
In general with c_str, you have to worry about two things. First, the pointer it returns becomes invalid if the string is destroyed (which is what you're asking). Second, the pointer it returns becomes invalid if the string is modified. You haven't worried about that here, but it's OK, you don't need to, because nothing modifies the string either.
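The second concern can be handled by fetching the pointer again after any modification. A minimal sketch, with a made-up function name:

```cpp
#include <cstring>
#include <string>

// Returns the length seen through a pointer that is re-fetched after the
// string has been modified, rather than one obtained before the change.
std::size_t length_after_append() {
    std::string s = "fish";
    const char* before = s.c_str();  // may be invalidated by the append below
    (void)before;                    // deliberately not used after this point
    s.append(1000, '!');             // likely to reallocate the buffer
    const char* after = s.c_str();   // fetch the pointer again after modifying
    return std::strlen(after);
}
```

Only `after` is safe to read here; `before` must be treated as invalid once append has run.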
Consider this code (for different values of renew and cleanse):
struct T {
int mem;
T() { }
~T() { mem = 42; }
};
// identity functions,
// but breaks any connexion between input and output
int &cleanse_ref(int &r) {
int *volatile pv = &r; // could also use cin/cout here
return *pv;
}
void foo () {
T t;
int &ref = t.mem;
int &ref2 = cleanse ? cleanse_ref(ref) : ref;
t.~T();
if (renew)
new (&t) T;
assert(ref2 == 42);
exit(0);
}
Is the assert guaranteed to pass?
I understand that this style is not recommended. Opinions like "this is not a sound practice" are not of interest here.
I want an answer showing a complete logical proof from standard quotes. The opinion of compiler writers might also be interesting.
EDIT: now with two questions in one! See the renew parameter (with renew == 0, this is the original question).
EDIT 2: I guess my question really is: what is a member object?
EDIT 3: now with another cleanse parameter!
I first had these two quotes, but now I think they actually just specify that things like int &ref = t.mem; must happen during the lifetime of t. Which it does, in your example.
12.7 paragraph 1:
For an object with a non-trivial destructor, referring to any non-static member or base class of the object after the destructor finishes execution results in undefined behavior.
And paragraph 3:
To form a pointer to (or access the value of) a direct non-static member of an object obj, the construction of obj shall have started and its destruction shall not have completed, otherwise the computation of the pointer value (or accessing the member value) results in undefined behavior.
We have here a complete object of type T and a member subobject of type int.
3.8 paragraph 1:
The lifetime of an object of type T begins when:
storage with the proper alignment and size for type T is obtained, and
if the object has non-trivial initialization, its initialization is complete.
The lifetime of an object of type T ends when:
if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
the storage which the object occupies is reused or released.
By the way, 3.7.3 p1:
The storage for these [automatic storage duration] entities lasts until the block in which they are created exits.
And 3.7.5:
The storage duration of member subobjects, base class subobjects and array elements is that of their complete object (1.8).
So no worries about the compiler "releasing" the storage before the exit in this example.
A non-normative note in 3.8p2 mentions that "12.6.2 describes the lifetime of base and member subobjects," but the language there only talks about initialization and destructors, not "storage" or "lifetime", so I conclude that section does not affect the definition of "lifetime" for subobjects of trivial type.
If I'm interpreting all this right, when renew is false, the lifetime of the complete class object ends when the explicit destructor call starts, BUT the lifetime of the int subobject continues to the end of the program.
3.8 paragraphs 5 and 6 say that pointers and references to "allocated storage" before or after any object's lifetime can be used in limited ways, and list a whole lot of things you may not do with them. Lvalue-to-rvalue conversion, like the expression ref == 42 requires, is one of those things, but that's not an issue if the lifetime of the int has not yet ended.
So I think with renew false, the program is well-formed and the assert succeeds!
With renew true, the storage is "reused" by the program, so the lifetime of the original int is over, and the lifetime of another int begins. But then we get into 3.8 paragraph 7:
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied, and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
the original object was a most derived object (1.8) of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).
The first bullet point here is the trickiest one. For a standard-layout class like your T, the same member certainly must always be in the same storage. I'm not certain whether or not this is technically required when the type is not standard-layout.
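For comparison, here is a sketch of ending and restarting an object's lifetime in the same storage that avoids the question entirely: it uses raw storage and only ever goes through the pointer returned by placement new, never an old name or reference. Counted is a hypothetical type, not the T from the question:

```cpp
#include <new>

// Hypothetical type with a non-trivial constructor and destructor.
struct Counted {
    static int alive;
    int mem;
    Counted() : mem(7) { ++alive; }
    ~Counted() { --alive; }
};
int Counted::alive = 0;

// Ends one object's lifetime and starts another in the same storage, always
// going through the pointer that placement new returns.
int reuse_storage() {
    alignas(Counted) unsigned char buffer[sizeof(Counted)];
    Counted* first = new (buffer) Counted;   // lifetime of the first object begins
    first->~Counted();                       // ...and ends here
    int between = Counted::alive;            // 0: no Counted object is alive
    Counted* second = new (buffer) Counted;  // a new object reuses the storage
    int value = second->mem + between;       // reads the new object's member
    second->~Counted();
    return value;
}
```

Because the member is initialized by the constructor and read through the fresh pointer, none of the conditions in the quoted paragraph need to be relied on.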
Whether or not ref may still be used, there's another issue in this example.
12.6.2 paragraph 8:
After the call to a constructor for class X has completed, if a member of X is neither initialized nor given a value during execution of the compound-statement of the body of the constructor, the member has indeterminate value.
Meaning the implementation is compliant if it sets t.mem to zero or 0xDEADBEEF (and sometimes debug modes will actually do such things before calling a constructor).
You have not destroyed the memory; you have only called the destructor manually (in this context that is no different from calling a normal method). The memory (on the stack) of your t variable was not 'released'. So this assert will always pass with your current code.