Is passing a reference through function safe? - c++

Is the following function safe in C++03 or C++11 or does it exhibit UB?
string const &min(string const &a, string const &b) {
return a < b ? a : b;
}
int main() {
cout << min("A", "B");
}
Is it OK to return a reference to an object passed to the function by reference?
Is it guaranteed that the temporary string object is not destroyed too soon?
Is there any chance that the given function min could exhibit UB (if it does not in the given context)?
Is it possible to make an equivalent, but safe function while still avoiding copying or moving?

Is it OK to return a reference to an object passed to the function by reference?
As long as the object isn't destroyed before you access it via that reference, yes.
Is it guaranteed that the temporary string object is not destroyed too soon?
In this case, yes. A temporary lasts until the end of the full expression which creates it, so it is not destroyed until after being streamed to cout.
Is there any chance that the given function min could exhibit UB (if it does not in the given context)?
Yes, here is an example:
auto const & r = min("A", "B"); // r is a reference to one of the temporaries
cout << r; // Whoops! Both temporaries have been destroyed
Is it possible to make an equivalent, but safe function while still avoiding copying or moving?
I don't think so; but this function is safe as long as you don't keep hold of a reference to its result.
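A minimal sketch of the two usage patterns that stay safe, assuming the function is renamed min_ref (a hypothetical name, chosen only to avoid clashing with std::min). Either the result is copied before the full-expression ends, which costs one copy, or the arguments are named objects that outlive the returned reference:

```cpp
#include <cassert>
#include <string>

// Same logic as min in the question; min_ref is a hypothetical name.
std::string const& min_ref(std::string const& a, std::string const& b) {
    return a < b ? a : b;
}

// Safe: the returned reference is copied into s while both temporaries
// are still alive (they are destroyed at the end of the full-expression).
std::string copy_result() {
    std::string s = min_ref("A", "B");
    return s;
}

// Safe: a and b are named objects that outlive the reference r.
std::string named_arguments() {
    std::string a = "A";
    std::string b = "B";
    std::string const& r = min_ref(a, b);
    assert(&r == &a);  // r refers to a itself; no copy was made
    return r;
}
```

The second pattern keeps the "no copy, no move" property inside the function; the first trades one copy for not having to name the arguments.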

Your temporary objects stay alive until the end of the full-expression, that is, until the ; that ends the cout statement in main, so this way of using the function is safe.

Yes, it's safe. The string temporaries for both "A" and "B" survive until the end of the full-expression, which here is the semicolon.

Is it guaranteed that the temporary string object is not destroyed too soon?
For your specific case, yes, but for the following code, no:
int main() {
const string &tempString(min("A", "B"));
cout << tempString;
}
Besides that, I agree with what Mike Seymour said.

Yes, it is safe to pass a reference through the function in the code above, because the string temporaries for both "A" and "B" survive until the end of the full-expression, that is, the semicolon. Keep in mind that with pass by value any changes are made to a copy of the argument, whereas with pass by reference the function works on the original object, so the returned reference is only valid while that original is alive.

Related

Is it safe to pass an `std::string` temporary into an `std::string_view` parameter?

Suppose I have the following code:
void some_function(std::string_view view) {
std::cout << view << '\n';
}
int main() {
some_function(std::string{"hello, world"}); // ???
}
Will view inside some_function be referring to a string which has been destroyed? I'm confused because, considering this code:
std::string_view view(std::string{"hello, world"});
Produces the warning (from clang++):
warning: object backing the pointer will be destroyed at the end of the full-expression [-Wdangling-gsl]
What's the difference?
(Strangely enough, using braces {} rather than brackets () to initialise the string_view above eliminates the warning. I've no idea why that is either.)
To be clear, I understand the above warning (the string_view outlives the string, so it holds a dangling pointer). What I'm asking is why passing a string into some_function doesn't produce the same warning.
std::string_view is nothing other than std::basic_string_view<char>, so let's see its documentation on cppreference:
The class template basic_string_view describes an object that can refer to a constant contiguous sequence of char-like objects with the first element of the sequence at position zero.
A typical implementation holds only two members: a pointer to constant CharT and a size.
The description of a typical implementation (a pointer and a size, with no ownership of the characters) tells us why clang is right about std::string_view view(std::string{"hello, world"});: as others have commented, once the declaration is complete, std::string{"hello, world"} is destroyed and the underlying pointer that the std::string_view holds dangles.
Clearly that's just a typical implementation, but since we know it is correct, it tells us at least that the standard doesn't require any implementation to do something special to keep temporaries alive.
some_function(std::string{"hello, world"}); is completely safe, as long as the function doesn't preserve the string_view for later use.
The temporary std::string is destroyed at the end of this full-expression (roughly speaking, at this ;), so it's destroyed after the function returns.
std::string_view view(std::string{"hello, world"}); always produces a dangling string_view, regardless of whether you use () or {}. If the choice of brackets affects compiler warnings, it's a compiler defect.
Is it safe to pass an std::string temporary into an std::string_view parameter?
In general, it isn't necessarily safe. It depends on what the function does. If you don't know, then you shouldn't assume it to be safe.
Knowing the definition of the function as shown, it is safe to call the example function with a temporary string.
Will view inside some_function be referring to a string which has been destroyed?
Not in this case, because the temporary argument string - which the string view refers to - hasn't been destroyed.
What's the difference?
The parameter of the function has shorter lifetime than the lifetime of the temporary passed as the argument. The lifetime of the string view variable is longer than the lifetime of the temporary argument passed to the constructor.
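The lifetime comparison above can be sketched as follows (C++17). The helper names count_spaces, safe_call and safe_named_owner are hypothetical and only serve the illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Uses the view only during the call, so a temporary argument is fine.
std::size_t count_spaces(std::string_view view) {
    std::size_t n = 0;
    for (char c : view) {
        if (c == ' ') ++n;
    }
    return n;
}

std::size_t safe_call() {
    // The temporary std::string lives until the end of this full-expression,
    // i.e. until after count_spaces has returned.
    return count_spaces(std::string{"hello, world"});
}

std::size_t safe_named_owner() {
    std::string owner{"hello, world"};  // owner outlives the view below
    std::string_view view{owner};
    assert(view.data() == owner.data());  // the view points into owner
    return view.size();
}
```

A view stored in a variable needs an owner that lives at least as long as the view, which is what safe_named_owner arranges.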
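As others have said, some_function(std::string{"hello, world"}); is safe: the string_view is passed by value, and the temporary string it refers to lives until the end of the full-expression containing the call, so it outlives the function body. If safety is all you are concerned with, that will do. If you mainly expect temporaries, the function can also take an rvalue reference, although std::string_view is already cheap to copy, so this is unlikely to matter for performance: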
void some_function(std::string_view&& view)
{
std::cout << "rval reference: " << view << '\n';
}
int main()
{
some_function(std::string{"hello, world"});
}
R-value references are great if you are going to use some_function() mainly for temporary values.

C++ extending lifetime of &&

In the following example:
http://coliru.stacked-crooked.com/a/7a1df22bb73f6030
struct D{
int i;
auto test2(int&& j){
return [&](){ // captured by reference!
cout << i*(j);
};
}
};
int main()
{
D d{10};
{
auto fn = d.test2(10);
fn(); // 1. wrong result here
d.test2(10)(); // 2. but ok here
}
}
Why does d.test2(10)(); work?
Should it really work, or is that just my undefined behavior happening to give the correct result?
P.S. After reading this, I see only one explanation: in (2) the temporary's lifetime lasts until the end of the full expression, and the call happens in the same expression in which the && is created, while (1) actually consists of two expressions:
a temporary bound to a reference parameter in a function call exists
until the end of the full expression containing that function call: if
the function returns a reference, which outlives the full expression,
it becomes a dangling reference.
Is this the case?
A temporary object lasts until the end of the line (well, full expression) where it is created, unless the lifetime is extended.
Your code does not extend the lifetimes of any temporaries. Lifetime extension through binding to references does not chain: only the first binding can extend the lifetime.
So the first case is UB, as you have a dangling reference. The referred-to temporary goes away at the end of the line; on the next line you follow the reference, and chaos happens.
In the second case, your reference does not extend the lifetime of the temporary, but the temporary lasts longer than the reference that binds to it does! They both die at the end of the line, in reverse order of construction.
So the call works.
Should it really work, or is that just my undefined behavior happening to give the correct result?
Seems like it. In the example you linked, you have these warnings:
warning: '<anonymous>' is used uninitialized in this function [-Wuninitialized]
Uninitialized objects have indeterminate values, and trying to access those values results in undefined behavior.
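One way to make case (1) well defined is to copy j into the closure instead of referring to the temporary. This is a sketch under that assumption (C++14 for the deduced return type); call_later is a hypothetical helper, and the lambda returns the product rather than printing it so that the result can be checked:

```cpp
#include <cassert>

struct D {
    int i;
    auto test2(int&& j) {
        // j is captured by value, so the closure no longer depends on the
        // temporary that was bound to the parameter.
        return [this, j] { return i * j; };
    }
};

int call_later() {
    D d{10};
    auto fn = d.test2(10);  // the temporary 10 dies at the end of this line
    int result = fn();      // fine: the closure holds its own copy of j
    assert(result == 100);
    return result;
}
```

The closure still refers to d through this, so d has to outlive fn, as it does here.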

Why is my string reference member variable set to an empty string in C++?

Consider the following code:
class Foo
{
private:
const string& _bar;
public:
Foo(const string& bar)
: _bar(bar) { }
const string& GetBar() { return _bar; }
};
int main()
{
Foo foo1("Hey");
cout << foo1.GetBar() << endl;
string barString = "You";
Foo foo2(barString);
cout << foo2.GetBar() << endl;
}
When I execute this code (in VS 2013), the foo1 instance has an empty string in its _bar member variable while foo2's corresponding member variable holds the reference to value "You". Why is that?
Update: I'm of course using the std::string class in this example.
For Foo foo1("Hey") the compiler has to perform a conversion from const char[4] to std::string. It creates a prvalue of type std::string. This line is equivalent to:
Foo foo1(std::string("Hey"));
A reference bind occurs from the prvalue to bar, and then another reference bind occurs from bar to Foo::_bar. The problem here is that std::string("Hey") is a temporary that is destroyed when the full expression in which it appears ends. That is, after the semicolon, std::string("Hey") will not exist.
This causes a dangling reference because you now have Foo::_bar referring to an instance that has already been destroyed. When you print the string you then incur undefined behavior for using a dangling reference.
The line Foo foo2(barString) is fine because barString exists after the initialization of foo2, so Foo::_bar still refers to a valid instance of std::string. A temporary is not created because the type of the initializer matches the type of the reference.
With foo1 you are taking a reference to an object that is destroyed at the end of the line. With foo2 the barString object still exists, so the reference remains valid.
Yeah, this is the wonders of C++ and understanding:
The lifetime of objects
That string is a class and literal char arrays are not "strings".
What happens with implicit constructors.
In any case, string is a class, "Hey" is actually just an array of characters. So when you construct Foo with "Hey" which wants a reference to a string, it performs what is called an implicit conversion. This happens because string has an implicit constructor from arrays of characters.
Now for the object-lifetime issue. Having constructed this string for you, where does it live and what is its lifetime? It lives for the duration of that call, here the constructor of Foo, and anything it calls. So the constructor can call all sorts of functions and the string remains valid throughout.
However once that call is over, the object expires. Unfortunately you have stored within your class a const reference to it, and you are allowed to. The compiler doesn't complain, because you may store a const reference to an object that is going to live longer.
Unfortunately this is a nasty trap. I recall once deliberately giving a constructor that really wanted a const reference a non-const reference instead, precisely to ensure that this situation could not occur (it would then refuse to accept a temporary). Possibly not the best workaround, but it worked at the time.
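A related technique, not the one described above, keeps the const reference parameter and adds a deleted rvalue overload instead. This is a sketch assuming C++11; read_through_reference is a hypothetical helper:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

class Foo {
    const std::string& bar_;
public:
    Foo(const std::string& bar) : bar_(bar) {}
    // Rvalues (temporaries) prefer this overload, and selecting a deleted
    // function is a compile error, so Foo can no longer bind to a temporary.
    Foo(std::string&&) = delete;
    const std::string& GetBar() const { return bar_; }
};

static_assert(std::is_constructible<Foo, std::string&>::value,
              "named strings are accepted");
static_assert(!std::is_constructible<Foo, std::string>::value,
              "temporary strings are rejected");

std::string read_through_reference() {
    std::string barString = "You";
    Foo foo2(barString);
    assert(&foo2.GetBar() == &barString);  // the reference points at barString
    return foo2.GetBar();
}
```

With this overload in place, Foo foo1("Hey") should also be rejected at compile time, since the implicitly created std::string is a temporary.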
Your best option really most of the time is just to copy the string. It is less expensive than you think unless you really process lots and lots of these. In your case it probably won't actually copy anything, and the compiler will secretly move the copy it made anyway.
You can also take a non-const reference to a string and "swap" it in.
With C++11 there is a further option of using move semantics, which means the contents of the string passed in are "acquired" and the source is left in a valid but unspecified state. This is particularly useful when you do want to take in temporaries, which yours is an example of (although mostly temporaries are constructed through an explicit constructor or a return value).
The problem is that in this code:
Foo foo1("Hey");
From the string literal "Hey" (raw char array, more precisely const char [4], considering the three characters in Hey and the terminating \0) a temporary std::string instance is created, and it is passed to the Foo(const string&) constructor.
This constructor saves a reference to this temporary string into the const string& _bar data member:
Foo(const string& bar)
: _bar(bar) { }
Now, the problem is that you are saving a reference to a temporary string. So when the temporary string "evaporates" (after the constructor call statement), the reference becomes dangling, i.e. it references ("points to...") some garbage.
So you incur undefined behavior (for example, compiling your code with g++ from MinGW on Windows, I get a different result).
Instead, in this second case:
string barString = "You";
Foo foo2(barString);
your foo2::_bar reference is associated to ("points to") the barString, which is not temporary, but is a local variable in main(). So, after the constructor call, the barString is still there when you print the string using cout << foo2.GetBar().
Of course, to fix that, you should consider using a std::string data member, instead of a reference.
In this way, the string will be deep-copied into the data member, and it will persist even if the input source string used in the constructor is a temporary (and "evaporates" after the constructor call).
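A minimal sketch of that fix, assuming the constructor takes its argument by value and moves it into the member (one common way to write it, not necessarily what the asker ended up with); from_temporary and from_named are hypothetical helpers:

```cpp
#include <cassert>
#include <string>
#include <utility>

class Foo {
    std::string bar_;  // owns the text instead of referring to it
public:
    // Taking by value accepts both temporaries and named strings;
    // std::move avoids a second copy when filling the member.
    Foo(std::string bar) : bar_(std::move(bar)) {}
    const std::string& GetBar() const { return bar_; }
};

std::string from_temporary() {
    Foo foo1("Hey");  // the temporary may die; foo1 keeps its own string
    assert(foo1.GetBar() == "Hey");
    return foo1.GetBar();
}

std::string from_named() {
    std::string barString = "You";
    Foo foo2(barString);  // barString is copied and left untouched
    assert(barString == "You");
    return foo2.GetBar();
}
```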

Lambdas and capture by reference local variables : Accessing after the scope

I am passing my local variables by reference to two lambdas. I call these lambdas outside of the function scope. Is this undefined?
std::pair<std::function<int()>, std::function<int()>> addSome() {
int a = 0, b = 0;
return std::make_pair([&a,&b] {
++a; ++b;
return a+b;
}, [&a, &b] {
return a;
});
}
int main() {
auto f = addSome();
std::cout << f.first() << " " << f.second();
return 0;
}
If it is not, however, changes in one lambda are not reflected in the other lambda.
Am I misunderstanding pass-by-reference in the context of lambdas?
I am writing to the variables and it seems to work fine, with no runtime errors, with output 2 0. If it works, then I would expect output 2 1.
Yes, this causes undefined behavior. The lambdas will reference stack-allocated objects that have gone out of scope. (Technically, as I understand it, the behavior is defined until the lambdas access a and/or b. If you never invoke the returned lambdas then there is no UB.)
This is undefined behavior the same way that it's undefined behavior to return a reference to a stack-allocated local and then use that reference after the local goes out of scope, except that in this case it's being obfuscated a bit by the lambda.
Further, note that before C++17 the order in which the lambdas are invoked is unspecified: the compiler is free to invoke f.second() before f.first() because both are part of the same full-expression. (Since C++17, the operands of << are evaluated left to right, so f.first() runs first.) Therefore, even if we fix the undefined behavior caused by using references to destroyed objects, both 2 0 and 2 1 are valid outputs from this program under the older rules, and which you get depends on the order in which your compiler decides to execute the lambdas. Note that this is not undefined behavior: the compiler cannot do just anything; it simply has some freedom in deciding the order in which to do some things.
(Keep in mind that << in your main() function is invoking an overloaded operator<< function, and the order in which function arguments are evaluated is unspecified. Compilers are free to emit code that evaluates all of the function arguments within the same full-expression in any order, with the constraint that all arguments to a function must be evaluated before that function is invoked.)
To fix the first problem, use std::shared_ptr to create a reference-counted object. Capture this shared pointer by value, and the lambdas will keep the pointed-to object alive as long as they (and any copies thereof) exist. This heap-allocated object is where we will store the shared state of a and b.
To fix the second problem, evaluate each lambda in a separate statement.
Here is your code rewritten with the undefined behavior fixed, and with f.first() guaranteed to be invoked before f.second():
std::pair<std::function<int()>, std::function<int()>> addSome() {
// We store the "a" and "b" ints instead in a shared_ptr containing a pair.
auto numbers = std::make_shared<std::pair<int, int>>(0, 0);
// a becomes numbers->first
// b becomes numbers->second
// And we capture the shared_ptr by value.
return std::make_pair(
[numbers] {
++numbers->first;
++numbers->second;
return numbers->first + numbers->second;
},
[numbers] {
return numbers->first;
}
);
}
int main() {
auto f = addSome();
// We break apart the output into two statements to guarantee that f.first()
// is evaluated prior to f.second().
std::cout << f.first();
std::cout << " " << f.second();
return 0;
}
Unfortunately C++ lambdas can capture by reference but don't solve the "upwards funarg problem".
Doing so would require allocating captured locals in "cells" and garbage collection or reference counting for deallocation. C++ does not do this, and unfortunately this makes C++ lambdas a lot less useful and more dangerous than in other languages like Lisp, Python or Javascript.
More specifically in my experience you should avoid at all costs implicit capture by reference (i.e. using the [&](…){…} form) for lambda objects that survive the local scope because that's a recipe for random segfaults later during maintenance.
Always plan carefully about what to capture and how and about the lifetime of captured references.
Of course it's safe to capture everything by reference with [&] if all you are doing is simply using the lambda in the same scope to pass code for example to algorithms like std::sort without having to define a named comparator function outside of the function or as locally used utility functions (I find this use very readable and nice because you can get a lot of context implicitly and there is no need to 1. make up a global name for something that will never be reused anywhere else, 2. pass a lot of context or creating extra classes just for that context).
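As a small illustration of that same-scope use, here is a sketch in which a [&] lambda is handed to std::sort and never leaves the function. The names sort_by_distance and example are hypothetical:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Sorts values by their distance from target.
std::vector<int> sort_by_distance(std::vector<int> values, int target) {
    // [&] is fine here: the lambda is used and discarded inside this scope,
    // so target is guaranteed to be alive whenever the lambda is called.
    std::sort(values.begin(), values.end(), [&](int lhs, int rhs) {
        return std::abs(lhs - target) < std::abs(rhs - target);
    });
    return values;
}

void example() {
    const std::vector<int> sorted = sort_by_distance({1, 10, 4, 7}, 5);
    // Distances from 5 are 4, 5, 1 and 2, so the order is 4, 7, 1, 10.
    assert((sorted == std::vector<int>{4, 7, 1, 10}));
}
```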
An approach that can work sometimes is capturing by value a shared_ptr to a heap-allocated state. This is basically implementing by hand what Python does automatically (but pay attention to reference cycles to avoid memory leaks: Python has a garbage collector, C++ doesn't).
When you are going out of scope, make a copy of the locals you use with capture by value ([=]):
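auto func() // return type deduction needs C++14
{
int x = 5;
//When called, local x will no longer be in scope; so, use capture by value.
//mutable is required because the lambda modifies its own copy of x.
return [=]() mutable {
x += 2;
return x;
};
}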
When you are in the same scope, better to use capture by reference ([&]):
void func(void)
{
int x = 5;
//When called, local x will still be in scope; safe to use capture by reference.
([&] {
x += 2;
})(); //Lambda is immediately invoked here, in the same scope as x, with ().
}

Is this a valid function?

What happens to the reference in the function parameter? If it is destroyed when the function returns, how is const int *i still a valid pointer?
const int* func(const int &x = 5)
{
return &x;
}
int main()
{
const int *i = func();
}
§12.2/5:
"A temporary bound to a reference parameter in a function call (5.2.2) persists until the completion of the full expression containing the call."
That means as i is being initialized, it's getting the address of a temporary object that does exist at that point. As soon as i is initialized, however, the temporary object will be destroyed, and i will become just another dangling pointer.
As such, yes, the function is valid -- but with the surrounding code as you've written it, any code you added afterward that attempted to dereference i would give undefined behavior.
Just because a pointer has a value doesn't mean it's a valid pointer.
In this case it holds an address which used to be that of x, and chances are that address still has the value 5, but it's not a valid pointer and you can't count on that value being there.
i points to a patch of memory that is unsafe to access; it is not a valid pointer.
The variable "i" is still a pointer, but even reading the value it points to will give you undefined behavior. That's why you should never write a function like func.
I think that x is created as an un-named temporary on the stack in setting up the call to func(). This temporary will exist until at least the end of the statement in the caller. So the int* i is perfectly valid. It only ceases to be valid at the end of the statement - which means that you cannot use it.
There is something in the standard about un-named temporaries being retained until the last reference to them goes out of scope, but I don't think it covers this explicit and hidden indirection.
[ Happy to have someone tell me otherwise.]
5 is program data. It is in the data segment, not the stack or heap.
So a pointer or reference to it will remain valid for the duration of the program.
Default arguments are evaluated every time the function is called, so the call func() is actually func(5), which binds a temporary to a reference-to-const. That temporary persists until the end of the full-expression containing the call, after which it is destroyed. Any pointer to the object is invalid from then on, and dereferencing it is undefined behaviour.
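To make the boundary concrete, here is a sketch of two uses of func that stay within the rules quoted from §12.2/5. The helper names read_within_full_expression and read_named_object are hypothetical:

```cpp
#include <cassert>

const int* func(const int& x = 5) {
    return &x;
}

int read_within_full_expression() {
    // The temporary created for the default argument is still alive here,
    // because the dereference happens in the same full-expression as the call.
    return *func();
}

int read_named_object() {
    int value = 7;
    const int* p = func(value);  // p points at value, not at a temporary
    assert(p == &value);
    return *p;                   // fine: value is still in scope
}
```

Storing the result of func() in a pointer and dereferencing it in a later statement, as in the question, is the case that dangles.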