Most efficient way to mutate data member - c++

I've been wondering about this question for a long time. What is the most idiomatic and / or efficient way to assign a new value to a data member (mutation)? I can think of 3 options:
Mutate directly from within method
Use a reference to mutate object
Assign the return value of the method to the data member (RVO)
Here's a demo. While reading the assembly, please keep in mind that the compiler probably optimizes away most of the differences in this contrived example; I just wanted to showcase the options in the simplest way. Please answer for the case where these methods are more involved.
Demo
#include <string>
#include <cstdio>
struct some_struct
{
auto assign_direct() {
str_ = "Hello World!";
}
auto assign_through_ref(std::string& ref) {
ref = "Hello World!";
}
auto assign_through_RVO() {
const std::string ret = "Hello World!";
return ret;
}
void internal_func() {
assign_direct();
assign_through_ref(str_);
str_ = assign_through_RVO();
}
std::string str_;
};
int main()
{
some_struct s;
s.internal_func();
}
My thought is that direct assignment and copy assignment must be equally efficient, as both dereference the this pointer and then dereference the effective address of the data member. So two dereferences are involved, while the assign_through_ref method only ever uses one dereference (except that this must be dereferenced to even call the method, but maybe this can be optimized away by an intelligent compiler).
Also what I want to know is what is most idiomatic / clear and least error prone? Maybe someone with some more years than me can give me some insights here!
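As a rough sketch of how the three options compare once a std::string is involved (Widget and its member names are made up for illustration and are not part of the demo above): assigning in place or through a reference writes into the existing string and can reuse its buffer, while the return-by-value variant builds a fresh string that is then move-assigned into the member. Note that a const local, as in assign_through_RVO above, can only be copied rather than moved if the compiler does not elide the return.

```cpp
#include <string>

// Hypothetical stand-in for some_struct; all names are illustrative only.
struct Widget {
    // Option 1: write to the member directly; the existing buffer can be reused.
    void set_direct() { name_ = "Hello World!"; }

    // Option 2: the same assignment, but the target is passed in explicitly.
    static void set_through_ref(std::string& target) { target = "Hello World!"; }

    // Option 3: build a new string and return it by value.
    // The local is deliberately non-const so it can be moved if NRVO is not applied.
    static std::string make_value() {
        std::string ret = "Hello World!";
        return ret;
    }

    void update_all() {
        set_direct();
        set_through_ref(name_);
        name_ = make_value();  // move assignment from the returned temporary
    }

    std::string name_;
};
```

For a member function that already has access to the member, option 1 is usually the least surprising to read; option 3 tends to pay off when the function is also useful outside the class.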

Related

Will (N)RVO be applied with my function in this situation?

I have the following code (in reality it's much more complicated, but I simplified it to make it easier to understand, so please disregard the things that seem stupid; I can't change them in my real situation):
#include <string>
using std::string;
ReportManager g_report_generator;
struct ReportManager
{
// I know, using c_str in this case is stupid.
// but just assume that it has to be this way
string GenerateReport() { string report("test"); return report.c_str(); }
}
string DoIt(bool remove_all)
{
if(g_report_generator.isEmpty())
return string();
string val = g_report_generator.GenerateReport();
if(remove_all)
g_report_generator.clear();
return val;
}
void main()
{
string s = DoIt(true);
}
Will (N)RVO be applied with my functions?
I did a bit of research, and it would seem like it, but I'm not really convinced and I'd like a second opinion (or more).
I'm using Visual Studio 2017.
To solve your problem I rewrote it.
#include <cstdlib> // for exit
#include <string>
struct string : std::string {
using std::string::string;
string(string&& s) {
exit(-1);
}
string(string const&) {
exit(-2);
}
string() {}
};
struct ReportManager
{
// I know, using c_str in this case is stupid.
// but just assume that it has to be this way
string GenerateReport()
{
string report("test");
return report.c_str();
}
bool isEmpty() const { return true; }
void clear() const {}
};
ReportManager g_report_generator;
string DoIt(bool remove_all)
{
if(g_report_generator.isEmpty())
return string();
string val = g_report_generator.GenerateReport();
if(remove_all)
g_report_generator.clear();
return val;
}
int main()
{
string s = DoIt(true);
}
The trick with this rewriting is that elision permits skipping copy/move ctors. So every time we actually copy an object (even if inlined), we'll insert an exit clause; only by elision can we avoid it.
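A gentler variant of the same trick, as a sketch: instead of exiting, a probe type can count how often its copy and move constructors run. Probe, make_unnamed and make_named are hypothetical names introduced here, not part of the code above.

```cpp
// Hypothetical probe type: counts copies and moves instead of calling exit().
struct Probe {
    Probe() {}
    Probe(const Probe&) { ++copies(); }
    Probe(Probe&&) noexcept { ++moves(); }

    // Function-local statics keep the counters usable without separate definitions.
    static int& copies() { static int n = 0; return n; }
    static int& moves() { static int n = 0; return n; }
};

// Returns an unnamed temporary: a candidate for RVO (guaranteed since C++17).
Probe make_unnamed() { return Probe(); }

// Returns a named local: a candidate for NRVO; if NRVO is not applied, it is moved.
Probe make_named() {
    Probe local;
    return local;
}
```

After calling each function, reading Probe::copies() and Probe::moves() shows which constructors actually ran with your compiler and flags.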
GenerateReport has no (N)RVO or any kind of elision, other than possibly under as-if. I doubt a compiler will be able to prove that, especially if the string is non-static and large enough to require heap storage.
For DoIt both NRVO and RVO are possible. Elision is legal there, even with side effects.
MSVC fails: notice the calls to ??0string@@QAE@$$QAU0@@Z, which is the move constructor of my local string class.
When I force the possible RVO case to run by saying it is empty, you'll see that the compiler also fails to RVO optimize here; there is an exit(-1) inlined into the disassembly.
Clang manages to RVO the return string(); but not NRVO the return val;.
By far the easiest fix is:
string DoIt(bool remove_all)
{
if(g_report_generator.isEmpty())
return string();
return [&]{
string val = g_report_generator.GenerateReport();
if(remove_all)
g_report_generator.clear();
return val;
}();
}
which has double RVO, and a lambda that does simple NRVO. Zero structural changes to your code, and functions which C++98 compilers can elide return values on (well, they don't support lambda, but you get the idea).
I do not think (N)RVO is possible in either function. GenerateReport has to construct a string from a character array, so there is nothing left for NRVO. DoIt returns two different values through its control paths, which makes it impossible to perform NRVO as well.

How do I reserve space on the stack for a non-default constructible?

I would basically like to write the following piece of code. I understand why it can't compile.
A instance; // A is a non-default-constructible type and therefore can't be declared like this
if (something)
{
instance = A("foo"); // use a constructor X
}
else
{
instance = A(42); // use *another* constructor Y
}
instance.do_something();
Is there a way to achieve this behaviour without involving heap-allocation?
There are better, cleaner ways to solve the problem than explicitly reserving space on the stack, such as using a conditional expression.
However if the type is not move constructible, or you have more complicated conditions that mean you really do need to reserve space on the stack to construct something later in two different places, you can use the solution below.
The standard library provides the aligned_storage trait, such that aligned_storage<sizeof(T), alignof(T)>::type is a POD type of the right size and alignment for storing a T, so you can use that to reserve the space, then use placement new to construct an object into that buffer:
std::aligned_storage<sizeof(A), alignof(A)>::type buf;
A* ptr;
if (cond)
{
// ...
ptr = ::new (&buf) A("foo");
}
else
{
// ...
ptr = ::new (&buf) A(42);
}
A& instance = *ptr;
Just remember to destroy it manually too, which you could do with a unique_ptr and custom deleter:
struct destroy_A {
void operator()(A* a) const { a->~A(); }
};
std::unique_ptr<A, destroy_A> cleanup(ptr);
Or using a lambda, although this wastes an extra pointer on the stack ;-)
std::unique_ptr<A, void(*)(A*)> cleanup(ptr, [](A* a){ a->~A();});
Or even just a dedicated local type instead of using unique_ptr
struct Cleanup {
A* a;
~Cleanup() { a->~A(); }
} cleanup = { ptr };
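Putting the pieces together, a minimal self-contained sketch might look like this. The struct A, the DestroyOnly deleter and the describe function are hypothetical, and an alignas buffer is used here in place of aligned_storage; the idea (reserve raw stack storage, placement-new into it, let a guard run the destructor) is the same.

```cpp
#include <memory>
#include <new>
#include <string>

// Hypothetical non-default-constructible type with two constructors.
struct A {
    explicit A(const char* s) : text(s), num(-1) {}
    explicit A(int n) : text(), num(n) {}
    std::string text;
    int num;
};

// Deleter that only runs the destructor; the storage itself is on the stack.
struct DestroyOnly {
    void operator()(A* a) const { a->~A(); }
};

int describe(bool use_text) {
    // Raw, suitably aligned stack storage; no A exists in it yet.
    alignas(A) unsigned char buf[sizeof(A)];

    A* raw = use_text ? ::new (static_cast<void*>(buf)) A("foo")
                      : ::new (static_cast<void*>(buf)) A(42);

    // The guard calls ~A() when it goes out of scope, even on early return.
    std::unique_ptr<A, DestroyOnly> guard(raw);
    return guard->num;
}
```

Here describe(true) constructs the object from "foo" and describe(false) from 42; in both cases the destructor runs exactly once when guard is destroyed.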
Assuming you want to do this more than once, you can use a helper function:
A do_stuff(bool flg)
{
return flg ? A("foo") : A(42);
}
Then
A instance = do_stuff(something);
Otherwise you can initialize using a conditional operator expression*:
A instance = something ? A("foo") : A(42);
* This is an example of how the conditional operator is not "just like an if-else".
In some simple cases you may be able to get away with this standard C++ syntax:
A instance=something ? A("foo"):A(42);
You did not specify which compiler you're using, but in more complicated situations, this is doable using the gcc compiler-specific extension:
A instance=({
something ? A("foo"):A(42);
});
This is a job for placement new, though there are almost certainly simpler solutions you could employ if you revisit your requirements.
#include <iostream>
#include <new>
#include <string>
struct A
{
A(const std::string& str) : str(str), num(-1) {};
A(const int num) : str(""), num(num) {};
void do_something()
{
std::cout << str << ' ' << num << '\n';
}
const std::string str;
const int num;
};
const bool something = true; // change to false to see alternative behaviour
int main()
{
alignas(A) char storage[sizeof(A)]; // alignas keeps the buffer suitably aligned for A
A* instance = 0;
if (something)
instance = new (storage) A("foo");
else
instance = new (storage) A(42);
instance->do_something();
instance->~A();
}
This way you can construct the A whenever you like, but the storage is still on the stack.
However, you have to destroy the object yourself (as above), which is nasty.
Disclaimer: My weak placement-new example is naive and not particularly portable. GCC's own Jonathan Wakely posted a much better example of the same idea.
std::experimental::optional<Foo> foo;
if (condition){
foo.emplace(arg1,arg2);
}else{
foo.emplace(zzz);
}
Then use *foo for access. Use boost::optional if you do not have the C++1z TS implementation (std::optional is the standardized form in C++17), or write your own optional.
Internally, it will use something like std aligned storage and a bool to guard "have I been created"; or maybe a union. It may be possible for the compiler to prove the bool is not needed, but I doubt it.
An implementation can be downloaded from github or you can use boost.
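With a C++17 compiler the same idea can be written with std::optional. The following is a small sketch under that assumption; the struct A and the function build_and_read are hypothetical names used only for illustration.

```cpp
#include <optional>
#include <string>

// Hypothetical non-default-constructible type with two constructors.
struct A {
    explicit A(const std::string& s) : str(s), num(-1) {}
    explicit A(int n) : str(), num(n) {}
    std::string str;
    int num;
};

int build_and_read(bool something) {
    std::optional<A> instance;  // empty for now; the A object will live inside it
    if (something) {
        instance.emplace("foo");  // constructs A in place via A(const std::string&)
    } else {
        instance.emplace(42);     // constructs A in place via A(int)
    }
    return instance->num;         // the optional destroys the A automatically
}
```

Compared with manual placement new, the optional tracks whether the object exists and runs the destructor for you.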

How do I return an immutable parameter from a method, unchanged, and without a copy in c++?

How do I return a parameter from a method, unchanged, and without a copy in c++?
// This is more or less the desired signature from the caller's point of view
SomeImmutableObject ManipulateIfNecessary(SomeImmutableObject const& existingObject)
{
// Do some work…
// ...
if (manipulationIsNeccessary)
{
// Return a new object with new data etc (preferably without another copy)...
return SomeImmutableObject(...);
}
else
{
// Return the original object intact (but with no further copies!)...
return existingObject;
}
}
An example is C#'s String.Trim method. C# strings are immutable and if Trim doesn't have to do any work, a reference to the existing string is returned, otherwise a new string object with the trimmed content is returned.
How would I mimic this semantic in C++ given something close to the above method signature?
Your object must be a reference type for this to work. Let's give a toy example for strings:
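#include <iostream>
#include <memory>
#include <string>
class RefString {
public:
RefString() : ref(new std::string()) { }
RefString(const std::string& str) : ref(new std::string(str)) { }
// Returns a new string only when trimming is needed; otherwise shares the existing one.
RefString trim_trailing_newline() const {
if (!ref->empty() && ref->back() == '\n') { // back() on an empty string is undefined
return RefString(ref->substr(0, ref->size()-1));
}
return *this; // copies the shared_ptr, not the string data
}
size_t size() const { return ref->size(); }
private:
std::shared_ptr<std::string> ref;
};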
int main(int argc, char** argv) {
RefString s("test\n");
std::cout << s.size() << "\n";
std::cout << s.trim_trailing_newline().size() << "\n";
return 0;
}
You may always return const SomeImmutableObject&. Note though that assigning the result to an object will invoke a copy.
SomeImmutableObject x = ManipulateIfNecessary(y); // will invoke a copy-ctor
The real trick would be the implementation. When the first "if" clause takes effect, you would presumably be returning a reference to a temporary (a bad thing to do). The newly created object would have to be dynamically allocated.
All in all, I do not think this is easily possible without some smart memory management.
A reasonable option is to implement SomeImmutableObject in a way that supports this - internally as a reference-counted smart-pointer to the logical state, while externally it may provide value semantics. (This can complicate usage from threaded code - you may want to read up on copy-on-write (COW) and why it became unpopular for implementing std::string.)
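A minimal sketch of that reference-counted approach, assuming the logical state is just a std::string: the alias ImmutableText and the function trim_trailing_newline are hypothetical names, and a real SomeImmutableObject would wrap the pointer behind a value-like interface.

```cpp
#include <memory>
#include <string>

// Hypothetical handle type: shared, read-only access to the logical state.
using ImmutableText = std::shared_ptr<const std::string>;

ImmutableText trim_trailing_newline(const ImmutableText& in) {
    if (in->empty() || in->back() != '\n') {
        return in;  // nothing to do: share the existing object (bumps the ref count only)
    }
    // Work is needed: allocate a new object holding the trimmed text.
    return std::make_shared<std::string>(in->substr(0, in->size() - 1));
}
```

When no trimming is needed the caller gets a handle to the very same string, which mirrors what String.Trim does in C#.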
If you're stuck with an existing SomeImmutableObject implementation you can't change, and you can't wrap it with a reference-counted smart-pointer of sorts, then choices get limited.
It doesn't provide as clean caller usage, but you could make manipulationIsNeccessary a caller-accessible function, then have the caller call the "new object with new data" code - in a second function:
SomeImmutableObject obj;
const SomeImmutableObject& o =
manipulationIsNecessary(obj) ? newObjectWithNewData(obj) : obj;
...use o...
By having newObjectWithNewData be a separate function, you should get return value optimisation kicking in (though it's always best to check with your compiler/settings).
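One caveat about the conditional expression, shown as a small sketch (Text, needs_trim, trimmed and the two length functions are hypothetical stand-ins for the names used above): when one operand is a temporary and the other an lvalue, the conditional expression as a whole yields a temporary, so the lvalue operand is copied in the else case. Branching with if/else avoids that copy.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical immutable type that counts how often it is copied.
struct Text {
    explicit Text(std::string v) : value(std::move(v)) {}
    Text(const Text& other) : value(other.value) { ++copies; }
    std::string value;
    static int copies;
};
int Text::copies = 0;

bool needs_trim(const Text& t) {
    return !t.value.empty() && t.value.back() == '\n';
}

Text trimmed(const Text& t) {
    return Text(t.value.substr(0, t.value.size() - 1));
}

std::size_t length_via_conditional(const Text& t) {
    // Mixed operands (temporary vs. lvalue): the result is a temporary,
    // so t is copied when no trimming is needed.
    const Text& o = needs_trim(t) ? trimmed(t) : t;
    return o.value.size();
}

std::size_t length_via_branch(const Text& t) {
    // Plain branching: the original object is used directly, no copy.
    if (needs_trim(t)) {
        return trimmed(t).value.size();
    }
    return t.value.size();
}
```

Checking Text::copies after each call makes the difference visible for input that needs no trimming.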

what is a good place to put a const in the following C++ statement

Consider the following class member:
std::vector<sim_mob::Lane *> IncomingLanes_;
The above container shall store pointers to some of my Lane objects. I don't want the subroutines that take this variable as an argument to be able to modify the Lane objects.
At the same time, I don't know where to put the 'const' keyword so that it does not stop me from populating the container.
Could you please help me with this?
thank you and regards
vahid
Edit:
Based on the answers I got so far (many thanks to them all), suppose this sample:
#include <vector>
#include<iostream>
using namespace std;
class Lane
{
private:
int a;
public:
Lane(int h):a(h){}
void setA(int a_)
{
a=a_;
}
void printLane()
{
std::cout << a << std::endl;
}
};
class B
{
public:
vector< Lane const *> IncomingLanes;
void addLane(Lane *l)
{
IncomingLanes.push_back(l);
}
};
int main()
{
Lane l1(1);
Lane l2(2);
B b;
b.addLane(&l1);
b.addLane(&l2);
b.IncomingLanes.at(1)->printLane();
b.IncomingLanes.at(1)->setA(12);
return 1;
}
What I meant was:
b.IncomingLanes.at(1)->printLane()
should work on IncomingLanes with no problem AND
b.IncomingLanes.at(1)->setA(12)
should not be allowed. (In the above example neither of the two mentioned methods works!)
Besides solving the problem, I am also looking for good programming practice. So if you think there is a solution to the above problem that works but is bad practice, please let us all know.
Thanks again
A detour first: Use a smart pointer such as shared_ptr and not raw pointers within your container. This would make your life a lot easier down the line.
Typically, what you are looking for is called design-const i.e. functions which do not modify their arguments. This, you achieve, by passing arguments via const-reference. Also, if it is a member function make the function const (i.e. this becomes const within the scope of this function and thus you cannot use this to write to the members).
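A minimal sketch of that idea, using a cut-down Lane and a hypothetical free function readLane: the const member function promises not to modify the object, and the const reference parameter means only such functions can be called inside readLane.

```cpp
// Cut-down Lane for illustration; getA is a hypothetical accessor.
class Lane {
public:
    explicit Lane(int a) : a_(a) {}
    void setA(int a) { a_ = a; }      // modifies the object: not callable through a const reference
    int getA() const { return a_; }   // const member function: callable through a const reference
private:
    int a_;
};

// Takes its argument by const reference, so it cannot modify the Lane.
int readLane(const Lane& lane) {
    // lane.setA(12);  // would not compile: setA is not a const member function
    return lane.getA();
}
```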
Without knowing more about your class it would be difficult to advise you to use a container of const-references to lanes. That would make inserting lane objects difficult -- a one-time affair, possible only via initializer lists in the ctor(s).
A few must reads:
The whole of FAQ 18
Sutter on const-correctness
Edit: code sample:
#include <vector>
#include <iostream>
#include <memory>
//using namespace std; I'd rather type the 5 characters
// This is almost redundant under the current circumstance
class Lane
{
private:
int a;
public:
Lane(int h):a(h){}
void setA(int a_) // do you need this?
{
a=a_;
}
void printLane() const // design-const
{
std::cout << a << std::endl;
}
};
class B
{
// be consistent with namespace qualification
std::vector< Lane const * > IncomingLanes; // don't expose impl. details
public:
void addLane(Lane const& l) // who's responsible for freeing `l'?
{
IncomingLanes.push_back(&l); // would change
}
void printLane(size_t index) const
{
#ifdef _DEBUG
IncomingLanes.at( index )->printLane();
#else
IncomingLanes[ index ]->printLane();
#endif
}
};
int main()
{
Lane l1(1);
Lane l2(2);
B b;
b.addLane(l1);
b.addLane(l2);
//b.IncomingLanes.at(1)->printLane(); // this is bad
//b.IncomingLanes.at(1)->setA(12); // this is bad
b.printLane(1);
return 0;
}
Also, as Matthieu M. suggested:
shared ownership is more complicated because it becomes difficult to
tell who really owns the object and when it will be released (and
that's on top of the performance overhead). So unique_ptr should be
the default choice, and shared_ptr a last resort.
Note that unique_ptrs may require you to move them using std::move. I am updating the example to use pointer to const Lane (a simpler interface to get started with).
You can do it this way:
std::vector<const sim_mob::Lane *> IncomingLanes_;
Or this way:
std::vector<sim_mob::Lane const *> IncomingLanes_;
In C/C++, const typename * and typename const * are identical in meaning.
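The equivalence can be checked at compile time. The sketch below uses a placeholder Lane and hypothetical alias names; it also shows the one position that does mean something different, namely const after the *, which makes the pointer itself const rather than the object it points to.

```cpp
#include <type_traits>

struct Lane {};  // placeholder type for illustration

using PtrToConstLaneA = const Lane*;   // pointer to const Lane
using PtrToConstLaneB = Lane const*;   // same type, different spelling
using ConstPtrToLane  = Lane* const;   // const pointer to a modifiable Lane

static_assert(std::is_same<PtrToConstLaneA, PtrToConstLaneB>::value,
              "const before or after the type name means the same thing");
static_assert(!std::is_same<PtrToConstLaneA, ConstPtrToLane>::value,
              "const after the * applies to the pointer, not the pointee");
```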
Updated to address updated question:
If really all you need to do is
b.IncomingLanes.at(1)->printLane()
then you just have to declare printLane like this:
void printLane() const // Tell compiler that printLane doesn't change this
{
std::cout << a << std::endl;
}
I suspect that you want the object to be able to modify the elements (i.e., you don't want the elements to truly be const). Instead, you want nonmember functions to only get read-only access to the std::vector (i.e., you want to prohibit changes from outside the object).
As such, I wouldn't put const anywhere on IncomingLanes_. Instead, I would expose IncomingLanes_ as a pair of std::vector<sim_mob::Lane *>::const_iterators (through methods called something like GetIncomingLanesBegin() and GetIncomingLanesEnd()).
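A sketch of what such an interface could look like. The cut-down Lane, the getA accessor and the sumOfLanes function are assumptions made for illustration; only the Get...Begin/End naming comes from the text above. Note that a const_iterator over Lane* elements stops callers from changing the vector, but the pointed-to Lane objects remain modifiable through it.

```cpp
#include <vector>

// Cut-down Lane for illustration.
class Lane {
public:
    explicit Lane(int a) : a_(a) {}
    int getA() const { return a_; }
private:
    int a_;
};

class B {
public:
    typedef std::vector<Lane*>::const_iterator LaneIterator;

    void addLane(Lane* lane) { incomingLanes_.push_back(lane); }

    // Read-only traversal: callers cannot insert, erase, or repoint elements.
    LaneIterator GetIncomingLanesBegin() const { return incomingLanes_.begin(); }
    LaneIterator GetIncomingLanesEnd() const { return incomingLanes_.end(); }

private:
    std::vector<Lane*> incomingLanes_;
};

// Example of a nonmember function that only reads the lanes.
int sumOfLanes(const B& b) {
    int total = 0;
    for (B::LaneIterator it = b.GetIncomingLanesBegin();
         it != b.GetIncomingLanesEnd(); ++it) {
        total += (*it)->getA();
    }
    return total;
}
```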
You may declare it like:
std::vector<const sim_mob::Lane *> IncomingLanes_;
You will be able to add or remove items from the vector, and even repoint an element, but you won't be able to change the Lane objects through it. See below:
IncomingLanes_.push_back(someLine); // OK
IncomingLanes_[0] = someLine; // OK: the stored pointer itself is not const
IncomingLanes_[0]->some_member = something; // error: the pointee is const
IncomingLanes_.erase(IncomingLanes_.begin()); // OK (erasing end() would be undefined behavior)
IncomingLanes_[0]->nonConstMethod(); // error
If you don't want other routines to modify IncomingLanes, but you do want to be able to modify it yourself, just use const in the function declarations that you call.
Or if you don't have control over the functions, when they're external, don't give them access to IncomingLanes directly. Make IncomingLanes private and provide a const getter for it.
I don't think what you want is possible without making the pointers stored in the vector const as well.
const std::vector<sim_mob::Lane*> // means the vector and its stored pointers are const, not the Lane objects pointed at
std::vector<const sim_mob::Lane*> // means no one can modify the data pointed at.
At best, the second version does what you want, but you will have this construct throughout your code wherever you do want to modify the data:
const_cast<sim_mob::Lane*>(theVector[i])->non_const_method();
Have you considered a different class hierarchy where sim_mob::Lane's public interface is const and sim_mob::Really_Lane contains the non-const interfaces? Then users of the vector cannot be sure a "Lane" object is "real" without using dynamic_cast.
Before we get to const goodness, you should first use encapsulation.
Do not expose the vector to the external world, and it will become much easier.
A weak (*) encapsulation here is sufficient:
class B {
public:
std::vector<Lane> const& getIncomingLanes() const { return incomingLanes; }
void addLane(Lane l) { incomingLanes.push_back(l); }
private:
std::vector<Lane> incomingLanes;
};
The above is very simple, and yet achieves the goal:
clients of the class cannot modify the vector itself
clients of the class cannot modify the vector content (Lane instances)
and of course, the class can access the vector content fully and modify it at will.
Your new main routine becomes:
int main()
{
Lane l1(1);
Lane l2(2);
B b;
b.addLane(l1);
b.addLane(l2);
b.getIncomingLanes().at(1).printLane();
b.getIncomingLanes().at(1).setA(12); // expected-error\
// { passing ‘const Lane’ as ‘this’ argument of
// ‘void Lane::setA(int)’ discards qualifiers }
return 1;
}
(*) This is weak in the sense that even though the attribute itself is not exposed, because we give a reference to it to the external world in practice clients are not really shielded.

How to modify a C++ structure with int *

I have the following structure:
struct CountCarrier
{
int *CurrCount;
};
And this is what I want to do:
int main()
{
CountCarrier carrier = CountCarrier();
*(carrier.CurrCount) = 2; // initialize the *(carrier.CurrCount) to 2
IncreaseCount(&carrier); // should increase the *(carrier.CurrCount) to 3
}
void IncreaseCount(CountCarrier *countCarrier)
{
int *currCounts = countCarrier->CurrCount;
(*currCounts)++;
}
So, my intention is specified in the comments.
However, I couldn't get this to work. For starters, the program throws an exception at this line:
*(carrier.CurrCount) = 2;
And I suspect the following line won't work as well. Anything I did wrong?
struct CountCarrier
{
int *CurrCount; //No memory assigned
};
You need to allocate some valid memory for the pointer inside the structure to point to before you can store data through it.
Unless you do so, you are attempting to write to some invalid address, which results in undefined behavior, which luckily in this case shows up as an exception.
Resolution:
struct CountCarrier
{
int *CurrCount; // memory is assigned in the constructor below
CountCarrier():CurrCount(new(int))
{
}
};
Suggestion:
Stay away from dynamic allocations as long as you can.
When you think of using pointers, always consider whether you really need one. In this case it doesn't seem that you do; a simple int member would be just fine.
You need to allocate the int that the pointer points to, i.e. carrier.CurrCount = new int;
*(carrier.CurrCount)
This is dereferencing the pointer carrier.CurrCount, but you never initialized it. I suspect this is what you want:
carrier.CurrCount = new int(2);
I seriously doubt that your program throws an exception at the line:
*(carrier.CurrCount) = 2;
While throwing an exception is certainly allowed behaviour, it seems much more likely that you encountered an access violation that caused the process to be killed by the operating system.
The problem is that you are using a pointer, but your pointer is not initialised to point at anything. This means that the result of the pointer dereference is undefined.
In this situation there does not seem to be any advantage to using a pointer at all. Your CurrCount member would work just as well if it was just a plain int.
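If the pointer member has to stay for some reason, one option that avoids dynamic allocation is to point it at an int that already exists. This is only a sketch; it assumes the counter variable outlives the CountCarrier that refers to it.

```cpp
struct CountCarrier {
    int* CurrCount;  // must point at a valid int before it is dereferenced
};

// Increments the int that the carrier points at.
void IncreaseCount(CountCarrier* countCarrier) {
    ++*countCarrier->CurrCount;
}

// Usage sketch:
//   int count = 2;
//   CountCarrier carrier = { &count };  // the pointer now refers to count
//   IncreaseCount(&carrier);            // count becomes 3
```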
If you are using C++, then you should make use of its facilities. Instead of correcting your code, I am showing here how the code should look:
struct CountCarrier
{
int CurrCount; // simple data member
CountCarrier(int count) : CurrCount(count) {} // constructor
CountCarrier& operator ++ () // overloaded operator
{
++ CurrCount;
return *this;
}
};
We are overloading operator ++, because you have only one data member. You can replace with some named method also, like void IncrementCount().
CountCarrier carrier(2);
++ carrier;
As Als said, you need to provide some memory for the code to work.
But why make it so complicated? You don't need any pointers for the code you have to work. The "modern C++" way looks more like this:
struct CountCarrier
{
public:
CountCarrier(int currCount) : currCount(currCount) {}
void IncreaseCount() { ++currCount; }
int GetCount() const { return currCount; }
private:
int currCount;
};
int main()
{
CountCarrier carrier(2); // Initialize carrier.currCount to 2
carrier.IncreaseCount(); // Increment carrier.currCount to 3
}
Note how much cleaner and less error prone that is. Like I said, pick up a good introductory C++ book and read through it.