Immutable "functional" data structure in C++11 - c++

I was trying to write down some implementations for a couple of data structures that I'm interested in for a multithreaded / concurrent scenario.
Most functional languages that I know of design their data structures to be immutable: if you add a value to an instance t1 of T, you actually get a new instance of T that holds t1 plus that value.
container t;
container s = t; //t and s refer to the same container.
t.add(value); //this makes a copy of t, and t is the copy
I can't find the appropriate keywords to do this in C++11; there are keywords, semantics and functions from the standard library that are clearly oriented to the functional approach, in particular I found that:
mutable is not enforced at runtime; it is more of a hint for the compiler, and it doesn't really help you design a new data structure or use an existing one in an immutable way
swap doesn't work on temporaries, and this is a big downside in my case
I also don't know how much the other keywords / functions can help with such a design. swap was the one that came closest to something useful, so I could at least start writing something, but apparently it's limited to lvalues.
So I'm asking: is it possible to design immutable data structures in C++11 with a functional approach?

You simply declare a class with private member variables and you don't provide any methods to change the value of these private members. That's it. You initialize the members only from the constructors of the class. No one will be able to change the data of the class this way. C++'s tool for creating immutable objects is the private visibility of the members.
mutable: This is one of the biggest hacks in C++. I've seen at most 2 places in my whole life where its usage was reasonable, and this keyword is pretty much the opposite of what you are looking for. If you are looking for a keyword in C++ that helps you mark data members as immutable at compile time, then you are looking for the const keyword. If you mark a class member as const, then you can initialize it only from the INITIALIZER LIST of the constructors and you can no longer modify it throughout the lifetime of the instance. And this is not C++11, it is plain C++. There are no magic language features to provide immutability; you can achieve it only by programming smartly.

In C++ "immutability" is granted by the const keyword. Sure - you can still change a const variable, but you have to do it on purpose (like here). In normal cases, the compiler won't let you do that. Since your biggest concern seems to be doing it in a functional style, and you want a structure, you can define it yourself like this:
class Immutable{
    // Assignment is disabled. (The original empty body returned nothing, which is undefined behaviour if it is ever called.)
    Immutable& operator=(const Immutable& b) = delete;
    const int myHiddenValue;
public:
    operator const int() const {return myHiddenValue;} // returns a copy; usable on const objects too
    Immutable(int valueGivenUponCreation): myHiddenValue(valueGivenUponCreation){}
};
If you define a class like that, even if you try to change myHiddenValue with const_cast, it won't actually do anything, since the value will be copied during the call to operator const int.
Note: there's no real reason to do this, but hey - it's your wish.
Also note: since pointers exist in C++, you can still change the value with some kind of pointer magic (get the address of the object, calculate the offset, etc.), but you can't really help that. You wouldn't be able to prevent that even when using a functional language, if it had pointers.
And on a side note - why are you trying to force yourself into using C++ in a functional manner? I can understand it's simpler for you, and you're used to it, but functional programming isn't often used because of its downsides. Note that whenever you create a new object, you have to allocate space. It's slower for the end-user.

Bartosz Milewski has implemented Okasaki's functional data structures in C++. He gives a very thorough treatise on why functional data structures are important for concurrency. In that treatise, he explains the need in concurrency to construct an object and then afterwards make it immutable:
Here’s what needs to happen: A thread has to somehow construct the
data that is destined to be immutable. Depending on the structure of
that data, this could be a very simple or a very complex process. Then
the state of that data has to be frozen — no more changes are
allowed.
As others have said, when you want to expose data in C++ and have it not be available for changing, you make your function signature look like this:
class MutableButExposesImmutably
{
private:
std::string member;
public:
void complicatedProcess() { member = "something else"; } // mutates
const std::string & immutableAccessToMember() const {
return member;
}
};
This is an example of a data structure that is mutable, but you can't mutate it directly.
I think what you are looking for is something like Java's final keyword: it allows you to construct an object, but thereafter the object remains immutable.
You can do this in C++. The following code sample compiles. Note that in the class Immutable, the object member is literally immutable, (unlike what it was in the previous example): You can construct it, but once constructed, it is immutable.
#include <iostream>
#include <string>
using namespace std;
class Immutable
{
private:
const std::string member;
public:
Immutable(std::string a) : member(a) {}
const std::string & immutable_member_view() const { return member; }
};
int main() {
Immutable foo("bar");
// your code goes here
return 0;
}
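To connect this with the Okasaki-style structures mentioned above, here is a minimal sketch of a persistent singly linked list. The name PersistentList and its interface are hypothetical (they are not taken from Milewski's code). The idea is only that nodes are const after construction, so "adding" returns a new list that shares its tail with the old one.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical persistent list: nodes are never modified after construction,
// so any number of lists (and threads) can share a common tail.
template <class T>
class PersistentList {
    struct Node {
        Node(T v, std::shared_ptr<const Node> n)
            : value(std::move(v)), next(std::move(n)) {}
        const T value;
        const std::shared_ptr<const Node> next;
    };
    std::shared_ptr<const Node> head_;  // null means "empty list"

    explicit PersistentList(std::shared_ptr<const Node> head)
        : head_(std::move(head)) {}

public:
    PersistentList() {}

    // Returns a NEW list with value in front; *this is left untouched.
    PersistentList push_front(T value) const {
        return PersistentList(std::make_shared<Node>(std::move(value), head_));
    }

    bool empty() const { return !head_; }
    const T& front() const { return head_->value; }  // precondition: !empty()
    PersistentList pop_front() const { return PersistentList(head_->next); }

    std::size_t size() const {
        std::size_t n = 0;
        for (const Node* p = head_.get(); p; p = p->next.get()) ++n;
        return n;
    }
};

inline void persistent_list_example() {
    PersistentList<int> a;
    PersistentList<int> b = a.push_front(1);  // a is still empty
    PersistentList<int> c = b.push_front(2);  // c shares b's node
    assert(a.empty());
    assert(b.size() == 1 && b.front() == 1);
    assert(c.size() == 2 && c.pop_front().front() == 1);
}
```

Every member function is const, so an instance can be read from several threads without locking. The price is one allocation per push_front, and a very long list is destroyed recursively, which a production version would probably want to avoid.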

Re. your code example with s and t. You can do this in C++, but "immutability" has nothing to do with that question, if I understand your requirements correctly!
I have used containers in vendor libraries that do operate the way you describe; i.e. when they are copied they share their internal data, and they don't make a copy of the internal data until it's time to change one of them.
Note that in your code example, there is a requirement that if s changes then t must not change. So s has to contain some sort of flag or reference count to indicate that t is currently sharing its data, so when s has its data changed, it needs to split off a copy instead of just updating its data.
So, as a very broad outline of what your container will look like: it will consist of a handle (e.g. a pointer) to some data, plus a reference count; and your functions that update the data all need to check the refcount to decide whether to reallocate the data or not; and your copy-constructor and copy-assignment operator need to increment the refcount.
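The outline above can be sketched as follows. This is only an illustration under assumptions: the class name CowIntVector is made up, and std::shared_ptr stands in for the hand-written handle plus reference count, so the copy constructor and copy assignment come for free.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical copy-on-write container of ints.
class CowIntVector {
    std::shared_ptr<std::vector<int>> data_;  // handle + reference count

    // Called before every mutation: make sure we own the buffer exclusively.
    void detach() {
        if (!data_)
            data_ = std::make_shared<std::vector<int>>();        // moved-from state
        else if (data_.use_count() > 1)
            data_ = std::make_shared<std::vector<int>>(*data_);  // split off a copy
    }

public:
    CowIntVector() : data_(std::make_shared<std::vector<int>>()) {}

    // The compiler-generated copy operations copy only the handle,
    // which increments the reference count.

    void add(int value) {
        detach();
        data_->push_back(value);
    }

    std::size_t size() const { return data_ ? data_->size() : 0; }
    int at(std::size_t i) const { return data_->at(i); }
    bool shares_storage_with(const CowIntVector& other) const {
        return data_ == other.data_;
    }
};

inline void cow_example() {
    CowIntVector t;
    t.add(1);
    CowIntVector s = t;                 // s and t refer to the same buffer
    assert(s.shares_storage_with(t));
    t.add(2);                           // t splits off its own copy first
    assert(!s.shares_storage_with(t));
    assert(t.size() == 2 && s.size() == 1);
}
```

Note that the use_count() check is only meaningful when the handles are not being copied concurrently from other threads; a thread-safe version needs more care than this sketch shows.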

Related

C++ const accessors and references best practice

In attempting to brush up on my C++, I've been trying to find out the best-practice way of creating accessors.
I want to clarify my understanding and find out if what I'm doing is right. I have several questions, but they seem pretty simple so I've rolled them all into this one Stack Overflow question.
The following is some example code representing what I 'think' is the correct way of doing things:
class MyClass
{
private:
std::string StringMember_;
int IntMember_;
public:
MyClass(const std::string &stringInput, const int &intInput) : StringMember_(stringInput), IntMember_(intInput)
{
}
const std::string &StringMember() const
{
return StringMember_;
}
void StringMember(const std::string &stringInput)
{
StringMember_ = stringInput;
}
const int &IntMember() const
{
return IntMember_;
}
void IntMember(const int &intInput)
{
IntMember_ = intInput;
}
};
My questions are:
Where my accessors return a const reference variable, ie const std::string, this means that it (my class's member variable) cannot be changed. Is that correct?
The last const before a method's body indicates that no members of the class for which that method is a part of can be altered, unless they are designated mutable. Is this also correct?
Where I'm passing in const method parameters, this means that I ensure these parameters are always stored as they were passed in, thus protecting any original variables being passed in from being altered by the method body. Is this also correct?
With regards to the mutable keyword, under what circumstances would I actually want to use this? I've been struggling to think of a good scenario where I'd have a const method that needed to modify class members.
Accessors seem like a good idea, even for data that will never be publicly exposed, because it ensures a single-point of entry, allowing for easier debugging and so on. Am I thinking along the right lines here, or is this in fact totally meaningless, and that there is no need for private accessors?
From a purely syntactical perspective, should I be writing my references like const int& intInput or const int &intInput. Does it really matter where the ampersand is, or is it just a matter of personal preference?
Finally, is what I'm doing in the example above good practice? I plan to start working on a larger personal project, and I want to have these core basics down before I start running into problems later.
I was using this as a reference: https://isocpp.org/wiki/faq/const-correctness
Where my accessors return a const reference variable, ie const std::string, this means that it (my class's member variable) cannot be changed. Is that correct?
Correct. A variable cannot be changed through a const reference.
The last const before a method's body indicates that no members of the class for which that method is a part of can be altered, unless they are designated mutable. Is this also correct?
Correct. It also allows the function to be called on a const object.
Where I'm passing in const method parameters, this means that I ensure these parameters are always stored as they were passed in, thus protecting any original variables being passed in from being altered by the method body. Is this also correct?
Correct. Same can be achieved with accepting the argument by value.
With regards to the mutable keyword, under what circumstances would I actually want to use this?
See When have you used C++ 'mutable' keyword?
Accessors seem like a good idea, even for data that will never be publicly exposed, because it ensures a single-point of entry, allowing for easier debugging and so on. Am I thinking along the right lines here
I don't buy this argument. Watchpoints allow for easy debugging of member variables regardless of where they're accessed from.
From a purely syntactical perspective, should I be writing my references like const int& intInput or const int &intInput.
Both are syntactically equivalent and the choice between them is purely aesthetic.
Finally, is what I'm doing in the example above good practice?
There is no general answer. Accessors are sometimes useful. Often they're redundant. If you provide a function that allows setting the value directly, such as you do here, then you might as well get rid of the accessors and make the member public.
Seems to me like you have a pretty good handle on the concepts here. As far as a mutable example there are lots, here's one: you have a search method, and for performance reasons you cache the last search results... that internal cache would need to be mutable for a const search method. I.e. the external behavior didn't change, but internally something might change.
Here are some examples for mutable:
memoization caches, for when something is referentially transparent but expensive to calculate: the first call to the (const-qualified) accessor calculates the value and stores it in a mutable member hash table; second and subsequent calls fetch the value from the table instead.
access counters, timing, loggers, and other instrumentation that needs to change some state when a const-qualified accessor is called
From https://www.quora.com/When-should-I-actually-use-a-mutable-keyword-in-C++
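A minimal sketch of the memoization and access-counter cases described above. The class SquareTable is hypothetical and the "expensive" computation is just a multiplication, to keep the example short.

```cpp
#include <cassert>
#include <map>

// Hypothetical example: a const-qualified accessor that caches its results.
class SquareTable {
    mutable std::map<int, long long> cache_;  // may change inside const methods
    mutable int misses_;                      // instrumentation counter

public:
    SquareTable() : misses_(0) {}

    // Logically const: the observable result never changes,
    // only the internal cache and the counter do.
    long long square(int x) const {
        std::map<int, long long>::const_iterator it = cache_.find(x);
        if (it != cache_.end())
            return it->second;                // cache hit
        ++misses_;
        long long result = static_cast<long long>(x) * x;
        cache_[x] = result;
        return result;
    }

    int misses() const { return misses_; }
};

inline void mutable_example() {
    const SquareTable table;                  // a const object
    assert(table.square(12) == 144);          // computed
    assert(table.square(12) == 144);          // served from the cache
    assert(table.misses() == 1);
}
```

One caveat: const member functions are usually assumed to be safe to call from several threads at once, so a real cache like this would also need a (mutable) mutex around the lookup.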

Can I use const references instead of getter functions?

I just wondered if I could bypass using getters if I just allowed a const reference variable, as follows
#include <string>
class cTest
{
private:
int m_i;
std::string m_str;
public:
const int & i;
const std::string & str;
cTest(void)
: i(m_i)
, str(m_str)
{}
};
int main(int argc, char *argv[])
{
cTest o;
int i = o.i; // works
o.i += 5; // fails
o.str.clear(); // fails
return 0;
}
I wonder why people do not seem to do this at all. Is there some severe disadvantage I am missing? Please contribute to the list of advantages and disadvantages, and correct them if necessary.
Advantages:
There is no overhead through calls of getter functions.
The program size is decreased because there are fewer functions.
I can still modify the internals of the class, the reference variables provide a layer of abstraction.
Disadvantages:
Instead of getter functions, I have a bunch of references. This increases the object size.
Using const_cast, people can mess up private members, but these people are mischievous, right?
Some severe disadvantages indeed (aside from the 2nd disadvantage that you also mention which I also put in the "severe" category):
1) You'll need to supply (and therefore maintain) a copy constructor: the compiler default will not work.
2) You'll need to supply an assignment operator: the compiler default will not work.
3) Think carefully about implementing the move semantics. Again, the compiler default will not work.
These three things mean the const reference anti-pattern you propose is a non-starter. Don't do it!
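To make the copy problem concrete, here is a small sketch with a hypothetical class RefView, a cut-down version of cTest from the question. The compiler-generated copy constructor copies the reference member as a reference, so the copy's public view still points into the original object, and the copy assignment operator is not generated at all.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical cut-down version of the class in the question.
class RefView {
    int m_i;

public:
    const int& i;  // public read-only "view" of m_i

    RefView() : m_i(0), i(m_i) {}
    void set(int v) { m_i = v; }
    int value() const { return m_i; }
};

// A reference member cannot be reseated, so the implicit copy assignment
// operator is defined as deleted.
static_assert(!std::is_copy_assignable<RefView>::value,
              "reference members disable the default copy assignment");

inline void ref_member_example() {
    RefView a;
    a.set(1);
    RefView b = a;              // default copy: b.i is bound to a's m_i
    a.set(42);
    assert(b.value() == 1);     // b's own data was copied correctly...
    assert(b.i == 42);          // ...but b.i reads a's member, not b's
    assert(&b.i == &a.i);
}
```

If a were destroyed before b, b.i would be a dangling reference.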
One advantage of getter functions is that you might at some point want to alter the returned value, and without a getter function you cannot do that. This scenario would actually require you to return by value rather than by reference, which is less common in C++. [edit] But with move semantics instead of references this should be doable. [/edit]
You might also want to put a breakpoint in the getter function to learn who is reading its value, you might want to add logging, etc. This is called encapsulation.
Another advantage of getters is that in debug builds you can add additional checks/asserts on the returned data.
In the end, the compiler will inline your getter functions, which will result in code similar to what you propose.
An additional disadvantage:
1) Template code will want to get values using a function call, e.g. size(); if you change it to a const& variable, then you will not be able to use your class with some templates. So this is a consistency problem.
If you want to avoid getters and setters, using a const reference member is not the solution.
Instead, you'll want to ensure const correctness on the surrounding struct (which automagically gives you const access to the members), and just let the members be whatever they logically need to be.
Be sure to read up on when getters and setters can, should, or could be replaced with public data members. See e.g. this question. Just note that the heralded advantage of setters/getters is that if you change the implementation behind them, the call sites are not affected. Reality seems to argue otherwise: refactoring a member along with all its access points is a trivial operation for any self-respecting C++ code editor.
Although one could argue for encapsulation, I'd more strongly argue for const correctness, which alleviates the need for much encapsulation and really simplifies code quite a lot.
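As a rough illustration of what const correctness on the surrounding struct means, consider the sketch below. The struct Settings and the two functions are made up for this example. The members are plain public data, and whether a piece of code may modify them is decided by how the object is passed around.

```cpp
#include <cassert>
#include <string>

// Hypothetical plain struct: no getters, no setters, no reference members.
struct Settings {
    int volume = 5;
    std::string name = "default";
};

// Read-only access: through a const reference every member is const.
inline std::string describe(const Settings& s) {
    // s.volume = 0;  // would not compile: s is const here
    return s.name + ":" + std::to_string(s.volume);
}

// Write access has to be requested explicitly in the signature.
inline void mute(Settings& s) { s.volume = 0; }

inline void const_struct_example() {
    Settings s;
    assert(describe(s) == "default:5");
    mute(s);
    assert(describe(s) == "default:0");
}
```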

C++ Getter/Setter (Alternatives?)

Okay, just about everywhere I read, I read that getters/setters are "evil".
Now, as a programmer who uses getters/setters often in PHP / C#, I do not see how they are evil. I have read that they break encapsulation, etc., but here is a simple example.
class Armor{
int armorValue;
public:
Armor();
Armor(int); //int here represents armor value
int GetArmorValue();
void SetArmorValue(int);
};
Now, let's say getters and setters are "evil".
How are you supposed to change a member variable after initialization?
Example:
Armor arm=Armor(128); //armor with 128 armor value
//for some reason I would like to change this armor value
arm.SetArmorValue(55); //if i do not use getters / setters how is this possible?
Lets say the above is not okay, for whatever reason.
What if my game restricts armor values from 1 to 500. (No armor can have a piece that has more than 500 armor or less than 1 armor).
Now my implementation becomes
void Armor::SetArmorValue(int tArmValue){ // name matches the declaration in the class
    if (tArmValue>=1 && tArmValue<=500)
        armorValue=tArmValue;
    else
        armorValue=1;
}
So, how else would I impose this restriction without using getters/setters?
How else would I modify a property without using getters/setters?
Should armorValue just be a public member variable in case 1, and the getters/setters used in case 2?
Curious. Thanks guys
You have misunderstood something. Not using getters/setters breaks encapsulation and exposes implementation details, and can be considered "evil" for some definition of evil.
I guess they can be considered evil in the sense that, without proper IDE/editor support, they are somewhat tedious to write in C++...
One pitfall of C++ is creating a getter that returns a non-const reference, which also allows modification. That's the same as returning a pointer to internal data: it locks in that part of the internal implementation and is really no better than making the field public.
Edit: based on comments and other answers, what you heard probably refers to always creating non-private getter and setter for every field. But I would not call that evil either, just stupid ;-)
Being slightly contrarian: yes, getters and setters (aka accessors and mutators) are mostly evil.
The evil here is not, IMO, so much from "breaking encapsulation", as from simply defining a variable to be of one type (e.g., int) when it's really not that type at all. Looking at your example, you're calling Armor an int, but it's really not. While it's undoubtedly an integer, it's certainly not an int, which (among other things) defines a range. While your type is an integer, it's never intended to support the same range as an int at all. If you want Armor to be of a type integer from 1 to 500, define a type to represent that directly, and define Armor as an instance of that type. In this case, since the invariant you want to enforce is defined as part of the type itself, you don't need to tack a setter onto it to try to enforce it.
#include <functional>
#include <istream>
#include <stdexcept>

template <class T, class less=std::less<T> >
class bounded {
    const T lower_, upper_;
    T val_;
    // true if value lies outside [lower_, upper_]
    bool check(T const &value) const {
        return less()(value, lower_) || less()(upper_, value);
    }
    void assign(T const &value) {
        if (check(value))
            throw std::domain_error("Out of Range");
        val_ = value;
    }
public:
    // val_ starts at the lower bound (an assumption; the original left it uninitialized)
    bounded(T const &lower, T const &upper)
        : lower_(lower), upper_(upper), val_(lower) {}
    bounded(bounded const &init)
        : lower_(init.lower_), upper_(init.upper_), val_(init.val_)
    { }
    bounded &operator=(T const &v) { assign(v); return *this; }
    operator T() const { return val_; }
    friend std::istream &operator>>(std::istream &is, bounded &b) {
        T temp;
        if (!(is >> temp))
            return is; // extraction failed; leave b unchanged
        if (b.check(temp))
            is.setstate(std::ios::failbit);
        else
            b.val_ = temp;
        return is;
    }
};
With this in place, defining some armor with a range of 1..500 becomes utterly trivial:
bounded<int> armor(1, 500);
Depending on the situation, you might prefer to define (for example) a saturating type where attempting to assign an out of range value is fine, but the value that actually is assigned will simply be the nearest value that is within range.
saturating<int> armor(1, 500);
armor = 1000;
std::cout << armor; // prints "500"
Of course, what I've given above is also a bit bare-bones. For your armor type, it would probably be convenient to support -= (and possibly +=) so an attack would end up something like x.armor -= 10;.
Bottom line: the (or at least "one") major problem with getters and setters is that they usually point to your having defined a variable as being of one type when you really want some other type that happened to be sort of similar in a few ways.
Now, it's true that some languages (e.g., Java) fail to provide the programmer with the tools necessary to write code like that. Here I'm trusting your use of the C++ tag to indicate that you really do want to write C++ though. C++ does provide you with the necessary tools, and (at least IMO) your code will be better off for your making good use of the tools it provides so your type enforces the required semantic constraints while still using clean, natural, readable syntax.
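The saturating type used above is not defined in the answer. A minimal sketch of what it could look like, assuming the same constructor shape as bounded and adding the += / -= operators the answer suggests:

```cpp
#include <cassert>
#include <functional>

// Hypothetical saturating counterpart to bounded: out-of-range values are
// clamped to the nearest bound instead of being rejected.
template <class T, class Less = std::less<T> >
class saturating {
    T lower_, upper_, val_;

    T clamp(T const &v) const {
        if (Less()(v, lower_)) return lower_;
        if (Less()(upper_, v)) return upper_;
        return v;
    }

public:
    // Starts at the lower bound (an arbitrary choice for this sketch).
    saturating(T const &lower, T const &upper)
        : lower_(lower), upper_(upper), val_(lower) {}

    saturating &operator=(T const &v)  { val_ = clamp(v); return *this; }
    saturating &operator+=(T const &d) { val_ = clamp(val_ + d); return *this; }
    saturating &operator-=(T const &d) { val_ = clamp(val_ - d); return *this; }
    operator T() const { return val_; }
};

inline void saturating_example() {
    saturating<int> armor(1, 500);
    armor = 1000;
    assert(armor == 500);   // clamped to the upper bound
    armor -= 10;            // an attack
    assert(armor == 490);
    armor -= 1000;
    assert(armor == 1);     // clamped to the lower bound
}
```

The intermediate val_ + d and val_ - d can still overflow for built-in integer types when the bounds are close to the limits of T; the sketch ignores that.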
In short: they aren't evil.
It's nothing wrong with them as long as they don't leak out the internal representation. I see no problems here.
A common criticism of get/set functions is that they can be abused by client code to perform operations that logically should be encapsulated in the class. For example, say a client wants to "polish" their armour, and decides the effect is to increase "value" by 20, so they do their little get and set thing and are happy. Then some other client code elsewhere decides rusty armour should drop the value by 30, and they do their bit. Meanwhile, a dozen other places in client code are also allowing polishing and rusting effects on armour - as well as say "reinforcing" and "cracking", and implementing them directly. There's no central control of this... the maintainer of the armour class has no ability to do things like:
have the rust, polish, reinforce and crack effects apply at most once per piece of armour
tune the number added to or subtract from value for specific logical effects
decide that the new "leather" armour type can't rust, and ignore client attempts to make it do so
On the other hand, if the first client that wanted to make armour rusty couldn't do so through the interface, they'd go to the maintainer of the armour class and say "hey, give me a function to do this", then other people could start using the logical-level "rust" operation, and if it became useful later to do the kinds of things I describe above they could be implemented easily and centrally in the armour class (e.g. by having a separate boolean to say if the armour was rusty, or a separate variable recording the rust effect).
So, the thing with get/set functions is they frustrate the natural evolution of an API of logical functionality, instead distributing logic throughout client code, leading in extremis to an unmaintainable mess.
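As a sketch of the alternative, here is a hypothetical Armour class that exposes the logical operations instead of a setter. The +20 and -30 figures are the ones from the example above; everything else (names, the can_rust flag) is made up for illustration.

```cpp
#include <cassert>

// Hypothetical armour class with logical-level operations instead of get/set.
class Armour {
    int base_value_;
    bool can_rust_;
    bool rusty_ = false;
    bool polished_ = false;

public:
    explicit Armour(int base_value, bool can_rust = true)
        : base_value_(base_value), can_rust_(can_rust) {}

    // Each effect applies at most once, however often clients call it.
    void polish() { polished_ = true; }
    void rust()   { if (can_rust_) rusty_ = true; }  // e.g. leather ignores this

    // The numbers live in one place and can be tuned centrally.
    int value() const {
        int v = base_value_;
        if (polished_) v += 20;
        if (rusty_)    v -= 30;
        return v;
    }
};

inline void armour_example() {
    Armour plate(100);
    plate.polish();
    plate.polish();                  // second call has no further effect
    assert(plate.value() == 120);
    plate.rust();
    assert(plate.value() == 90);

    Armour leather(50, false);       // an armour type that cannot rust
    leather.rust();
    assert(leather.value() == 50);
}
```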
Your getter/setter looks ok.
The alternative to getter/setters is to make member variables public. To be more precise, group variables into structure without member functions. And operate on this structure within your class
Giving access to members reduces encapsulation, but sometimes it's necessary. And the best way to do it is by means of getters and setters. Some people implement them when no such access is necessary, just because they can and it's a habit.
Getters are evil whenever:
They access directly data members of the class
When you have to add new getter every time you add data to the class
The data behaviour is different in each getter
Good getters would thus do the following:
They forward the request to some other object or collect the data from several places
You can fetch large amounts of data using just one getter
All the data you fetch is handled the same way
Setters on the other hand are evil always.
how else would I impose this restriction without using getters/setters? How else would I modify a property without using getters/setters?
You can check what you read from the variable and if its value is out of range use a predefined value instead (if possible).
You can also resort to dirty hacks such as protecting the memory underneath the variable from writing, catching write attempts and disallowing/ignoring the ones with invalid values. This is going to be cumbersome to implement and expensive to execute. It may be useful for debugging, though.

C++ should all member variable use accessors and mutator

I have about 15~20 member variables which needs to be accessed, I was wondering
if it would be good just to let them be public instead of giving every one of them
get/set functions.
The code would be something like
class A { // a singleton class
public:
static A* get();
B x, y, z;
// ... a lot of other object that should only have one copy
// and doesn't change often
private:
A();
virtual ~A();
static A* a;
};
I have also thought about putting the variables into an array, but I don't
know the best way to do a lookup table, would it be better to put them in an array?
EDIT:
Is there a better way than Singleton class to put them in a collection
The C++ world isn't quite as hung up on "everything must be hidden behind accessors/mutators/whatever-they-decide-to-call-them-today" as some OO-supporting languages.
With that said, it's a bit hard to say what the best approach is, given your limited description.
If your class is simply a 'bag of data' for some other process, then using a struct instead of a class (the only difference is that all members default to public) can be appropriate.
If the class actually does something, however, you might find it more appropriate to group your get/set routines together by function/aspect or interface.
As I mentioned, it's a bit hard to tell without more information.
EDIT: Singleton classes are not smelly code in and of themselves, but you do need to be a bit careful with them. If a singleton is taking care of preference data or something similar, it only makes sense to make individual accessors for each data element.
If, on the other hand, you're storing generic input data in a singleton, it might be time to rethink the design.
You could place them in a POD structure and provide access to an object of that type :
struct VariablesHolder
{
int a;
float b;
char c[20];
};
class A
{
public:
A() : vh()
{
}
VariablesHolder& Access()
{
return vh;
}
const VariablesHolder& Get() const
{
return vh;
}
private:
VariablesHolder vh;
};
No, that wouldn't be good. Imagine you want to change the way they are accessed in the future, for example by removing one member variable and letting the get/set functions compute its value.
It really depends on why you want to give access to them, how likely they are to change, how much code uses them, how problematic having to rewrite or recompile that code is, how fast access needs to be, whether you need/want virtual access, what's more convenient and intuitive in the using code etc.. Wanting to give access to so many things may be a sign of poor design, or it may be 100% appropriate. Using get/set functions has much more potential benefit for volatile (unstable / possibly subject to frequent tweaks) low-level code that could be used by a large number of client apps.
Given your edit, an array makes sense if your client is likely to want to access the values in a loop, or a numeric index is inherently meaningful. For example, if they're chronologically ordered data samples, an index sounds good. Summarily, arrays make it easier to provide algorithms to work with any or all of the indices - you have to consider whether that's useful to your clients; if not, try to avoid it as it may make it easier to mistakenly access the wrong values, particularly if say two people branch some code, add an extra value at the end, then try to merge their changes. Sometimes it makes sense to provide arrays and named access, or an enum with meaningful names for indices.
This is a horrible design choice, as it allows any component to modify any of these variables. Furthermore, since access to these variables is done directly, you have no way to impose any invariant on the values, and if suddenly you decide to multithread your program, you won't have a single set of functions that need to be mutex-protected, but rather you will have to go off and find every single use of every single data member and individually lock those usages. In general, one should:
Not use singletons or global variables; they introduce subtle, implicit dependencies between components that allow seemingly independent components to interfere with each other.
Make variables const wherever possible and provide setters only where absolutely required.
Never make variables public (unless you are creating a POD struct, and even then, it is best to create POD structs only as an internal implementation detail and not expose them in the API).
Also, you mentioned that you need to use an array. You can use vector<B> or vector<B*> to create a dynamically-sized array of objects of type B or type B*. Rather than using A::get() to access your singleton instance, it would be better to have functions that need type A take a parameter of type const A&. This will make the dependency explicit, and it will also limit which functions can modify the members of that class (pass A* or A& to functions that need to mutate it).
As a convention, if you want a data structure to hold several public fields (plain old data), I would suggest using a struct (and use in tandem with other classes -- builder, flyweight, memento, and other design patterns).
Classes generally mean that you're defining an encapsulated data type, so the OOP rule is to hide data members.
In terms of efficiency, modern compilers optimize away calls to accessors/mutators, so the impact on performance would be non-existent.
In terms of extensibility, methods are definitely a win because derived classes would be able to override these (if virtual). Another benefit is that logic to check/observe/notify data can be added if data is accessed via member functions.
Public members in a base class are generally difficult to keep track of.

C++: Copy constructor: Use getters or access member vars directly?

I have a simple container class with a copy constructor.
Do you recommend using getters and setters, or accessing the member variables directly?
#include <string>
using std::string;

class Container
{
public:
    Container() {}
    Container(const Container& cont) //option 1
    {
        SetMyString(cont.GetMyString());
    }
    //OR (only one of the two copy constructors can be defined at a time)
    // Container(const Container& cont) //option 2
    // {
    //     m_str1 = cont.m_str1;
    // }
    string GetMyString() const { return m_str1;} // const, so it can be called on cont
    void SetMyString(string str) { m_str1 = str;}
private:
    string m_str1;
};
In the example, all code is inline, but in our real code there is no inline code.
Update (29 Sept 09):
Some of these answers are well written; however, they seem to be missing the point of this question:
this is simple contrived example to discuss using getters/setters vs variables
initializer lists or private validator functions are not really part of this question. I'm wondering if either design will make the code easier to maintain and expand.
Some people are focusing on the string in this example; however, it is just an example, imagine it is a different object instead.
I'm not concerned about performance. we're not programming on the PDP-11
EDIT: Answering the edited question :)
this is simple contrived example to
discuss using getters/setters vs
variables
If you have a simple collection of variables, that don't need any kind of validation, nor additional processing then you might consider using a POD instead. From Stroustrup's FAQ:
A well-designed class presents a clean
and simple interface to its users,
hiding its representation and saving
its users from having to know about
that representation. If the
representation shouldn't be hidden -
say, because users should be able to
change any data member any way they
like - you can think of that class as
"just a plain old data structure"
In short, this is not Java. You shouldn't write plain getters/setters, because they are as bad as exposing the variables themselves.
initializer lists or private validator functions are not really
part of this question. I'm wondering
if either design will make the code
easier to maintain and expand.
If you are copying another object's variables, then the source object should be in a valid state. How did the ill-formed source object get constructed in the first place?! Shouldn't constructors do the job of validation? Aren't the modifying member functions responsible for maintaining the class invariant by validating input? Why would you validate a "valid" object in a copy constructor?
I'm not concerned about performance. we're not programming on the PDP-11
This is about the most elegant style, though in C++ the most elegant code usually has the best performance characteristics as well.
You should use an initializer list. In your code, m_str1 is default constructed then assigned a new value. Your code could be something like this:
class Container
{
public:
Container() {}
Container(const Container& cont) : m_str1(cont.m_str1)
{ }
string GetMyString() const { return m_str1; }
void SetMyString(const string& str) { m_str1 = str; }
private:
string m_str1;
};
@cbrulak IMO you shouldn't validate cont.m_str1 in the copy constructor. What I do is validate things in the ordinary constructors. Validation in the copy constructor means you are copying an ill-formed object in the first place. For example:
Container(const string& str) : m_str1(str)
{
if(!valid(m_str1)) // valid() is a function to check your input
{
// throw an exception!
}
}
You should use an initializer list, and then the question becomes meaningless, as in:
Container(const Container& rhs)
: m_str1(rhs.m_str1)
{}
There's a great section in Matthew Wilson's Imperfect C++ that explains all about Member Initializer Lists, and about how you can use them in combination with const and/or references to make your code safer.
Edit: an example showing validation and const:
class Container
{
public:
Container(const string& str)
: m_str1(validate_string(str))
{}
private:
static const string& validate_string(const string& str)
{
if(str.empty())
{
throw runtime_error("invalid argument");
}
return str;
}
private:
const string m_str1;
};
As it's written right now (with no qualification of the input or output) your getter and setter (accessor and mutator, if you prefer) are accomplishing absolutely nothing, so you might as well just make the string public and be done with it.
If the real code really does qualify the string, then chances are pretty good that what you're dealing with isn't properly a string at all -- instead, it's just something that looks a lot like a string. What you're really doing in this case is abusing the type system, sort of exposing a string, when the real type is only something a bit like a string. You're then providing the setter to try to enforce whatever restrictions the real type has compared to a real string.
When you look at it from that direction, the answer becomes fairly obvious: rather than a string, with a setter to make the string act like some other (more restricted) type, what you should be doing instead is defining an actual class for the type you really want. Having defined that class correctly, you make an instance of it public. If (as seems to be the case here) it's reasonable to assign it a value that starts out as a string, then that class should contain an assignment operator that takes a string as an argument. If (as also seems to be the case here) it's reasonable to convert that type to a string under some circumstances, it can also include a conversion operator that produces a string as the result.
This gives a real improvement over using a setter and getter in a surrounding class. First and foremost, when you put those in a surrounding class, it's easy for code inside that class to bypass the getter/setter, losing enforcement of whatever the setter was supposed to enforce. Second, it maintains a normal-looking notation. Using a getter and a setter forces you to write code that's just plain ugly and hard to read.
One of the major strengths of a string class in C++ is using operator overloading so you can replace something like:
strcat(filename, ".ext");
with:
filename += ".ext";
to improve readability. But look what happens if that string is part of a class that forces us to go through a getter and setter:
some_object.setfilename(some_object.getfilename()+".ext");
If anything, the C code is actually more readable than this mess. On the other hand, consider what happens if we do the job right, with a public object of a class that defines an operator string, operator= and operator+=:
some_object.filename += ".ext";
Nice, simple and readable, just like it should be. Better still, if we need to enforce something about the string, we only have to inspect that small class: there are just one or two specific, well-known places to look (operator=, possibly a ctor or two for that class) to know that it's always enforced -- a totally different story from when we're using a setter to try to do the job.
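Here is a rough sketch of what such a small class might look like. The names FileName and SomeObject, and the "must not be empty" rule, are assumptions picked purely for illustration; the real restriction would be whatever the setter was meant to enforce.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical restricted type: behaves like a string, but may never be empty.
class FileName {
public:
    explicit FileName(const std::string& s) : value_(checked(s)) {}

    // Assigning from a string goes through the same check as construction.
    FileName& operator=(const std::string& s) {
        value_ = checked(s);
        return *this;
    }

    // Appending is checked as well, so the rule is enforced in one class only.
    FileName& operator+=(const std::string& suffix) {
        const std::string combined = value_ + suffix;
        value_ = checked(combined);
        return *this;
    }

    // Conversion back to a plain string for code that needs one.
    operator std::string() const { return value_; }

private:
    static const std::string& checked(const std::string& s) {
        if (s.empty()) {
            throw std::invalid_argument("file name must not be empty");
        }
        return s;
    }

    std::string value_;
};

// The surrounding class can expose the member directly: it guards itself.
struct SomeObject {
    FileName filename;
    explicit SomeObject(const std::string& name) : filename(name) {}
};

// Usage: normal-looking notation, no getter/setter round trip.
inline std::string example_usage() {
    SomeObject some_object("report");
    some_object.filename += ".ext";
    return some_object.filename;  // implicit conversion to std::string
}
```

Code inside SomeObject cannot bypass the check either, because the only ways to change the value are the checked operators of FileName.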
Do you anticipate changing how the string is returned, e.g. whitespace trimmed, null checked, etc.? The same goes for SetMyString(). If the answer is yes, you are better off with access methods, since you don't have to change your code in a zillion places but just modify those getter and setter methods.
Ask yourself what the costs and benefits are.
Cost: higher runtime overhead. Calling virtual functions in ctors is a bad idea, but setters and getters are unlikely to be virtual.
Benefits: if the setter/getter does something complicated, you're not repeating code; if it does something unintuitive, you're not forgetting to do that.
The cost/benefit ratio will differ for different classes. Once you've ascertained that ratio, use your judgment. For immutable classes, of course, you don't have setters, and you don't need getters (const members and references can be public, as no one can change/reseat them).
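A small sketch of that last point, using a made-up Version type (the name and its members are hypothetical): the members are const and public, so there is nothing for a setter to do and nothing for a getter to protect. A "modification" simply builds a new object.

```cpp
#include <cassert>

// Hypothetical immutable value type: const members can only be set in the
// constructor's initializer list and never change afterwards.
struct Version {
    const int major_number;
    const int minor_number;

    Version(int major_value, int minor_value)
        : major_number(major_value), minor_number(minor_value) {}

    // "Changing" the object means returning a new one; *this is untouched.
    Version next_minor() const {
        return Version(major_number, minor_number + 1);
    }
};

// Usage: public read access, no way to write.
inline void example_usage() {
    const Version v(1, 2);
    const Version w = v.next_minor();
    assert(v.minor_number == 2);  // the original is unchanged
    assert(w.minor_number == 3);
    // v.minor_number = 5;        // would not compile: member is const
}
```

One trade-off to be aware of: const members make the implicit copy assignment operator unavailable, so objects of such a type can be copy-constructed but not reassigned.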
There's no silver bullet as to how to write the copy constructor.
If your class only has members whose copy constructors create instances that do not share state (or at least do not appear to do so), using an initializer list is a good way.
Otherwise you'll have to actually think.
struct alpha {
beta* m_beta;
alpha() : m_beta(new beta()) {}
~alpha() { delete m_beta; }
alpha(const alpha& a) {
// need to copy? or do you have a shared state? copy on write?
m_beta = new beta(*a.m_beta);
// wrong: both objects would then delete the same beta
// m_beta = a.m_beta;
}
};
Note that you can get around the potential segfault by using a smart pointer, but you can still have a lot of fun debugging the resulting bugs.
Of course it can get even funnier.
Members which are created on demand.
new beta(*a.m_beta) is wrong in case you somehow introduce polymorphism.
... ah, screw the "otherwise": please always think when writing a copy constructor.
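One possible way to handle the polymorphism case is a virtual clone() function combined with std::unique_ptr (C++11). This is only a sketch: clone(), id() and derived_beta are names invented here, and it assumes m_beta is never null.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct beta {
    virtual ~beta() {}
    // Each class copies itself, so the dynamic type survives the copy.
    virtual std::unique_ptr<beta> clone() const {
        return std::unique_ptr<beta>(new beta(*this));
    }
    virtual int id() const { return 1; }
};

// Hypothetical subclass, introduced only to show why new beta(*p) would slice.
struct derived_beta : beta {
    std::unique_ptr<beta> clone() const override {
        return std::unique_ptr<beta>(new derived_beta(*this));
    }
    int id() const override { return 2; }
};

struct alpha {
    std::unique_ptr<beta> m_beta;  // owns the object, no manual delete needed

    alpha() : m_beta(new beta()) {}
    explicit alpha(std::unique_ptr<beta> b) : m_beta(std::move(b)) {
        assert(m_beta != nullptr);  // assumption of this sketch: never null
    }

    // Deep copy through clone(): no shared state, no slicing.
    alpha(const alpha& a) : m_beta(a.m_beta->clone()) {}
    alpha& operator=(const alpha& a) {
        m_beta = a.m_beta->clone();
        return *this;
    }
};

// Usage: the copy owns a distinct object of the same dynamic type.
inline bool example_usage() {
    alpha original(std::unique_ptr<beta>(new derived_beta()));
    alpha copy(original);
    return copy.m_beta->id() == 2 &&
           copy.m_beta.get() != original.m_beta.get();
}
```

Whether a deep copy is the right semantics (as opposed to shared state or copy on write) is still a decision you have to make per class; the smart pointer only takes care of the cleanup.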
Why do you need getters and setters at all?
Simple :) - They preserve invariants - i.e. guarantees your class makes, such as "MyString always has an even number of characters".
If implemented as intended, your object is always in a valid state - so a memberwise copy can very well copy the members directly without fear of breaking any guarantee. There is no advantage of passing already validated state through another round of state validation.
As AraK said, the best would be using an initializer list.
Not so simple (1):
Another reason to use getters/setters is to avoid relying on implementation details. That's a strange idea for a copy CTor: when changing such implementation details, you almost always need to adjust CDA anyway.
Not so simple (2):
To prove me wrong, you can construct invariants that are dependent on the instance itself, or on another external factor. One (very contrived) example: "if the number of instances is even, the string length is even, otherwise it's odd." In that case, the copy CTor would have to throw, or adjust the string. In such a case it might help to use setters/getters, but that's not the general case. You shouldn't derive general rules from oddities.
I prefer using an interface for outer classes to access the data, in case you want to change the way it's retrieved. However, when you're within the scope of the class and want to replicate the internal state of the copied value, I'd go with data members directly.
Not to mention that you'll probably save a few function calls if the getters are not inlined.
If your getters are (inline and) not virtual, there are neither pluses nor minuses in using them wrt direct member access -- it just looks goofy to me in terms of style, but, no big deal either way.
If your getters are virtual, then there is overhead... but nevertheless that's exactly when you DO want to call them, just in case they're overridden in a subclass!-)
There is a simple test that works for many design questions, this one included: add side-effects and see what breaks.
Suppose the setter not only assigns a value, but also writes an audit record, logs a message or raises an event. Do you want this to happen for every property when copying an object? Probably not - so calling setters in a constructor is logically wrong (even if the setters are in fact just assignments).
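To make the test concrete, here is a sketch where the setter has a side effect. AuditCount() is a hypothetical stand-in for "writes an audit record"; the rest follows the Container from the question. The copy constructor initializes the member directly, so copying produces no audit records.

```cpp
#include <cassert>
#include <string>

class Container {
public:
    Container() {}

    // Copies the member directly: no setter call, hence no side effects.
    Container(const Container& other) : m_str1(other.m_str1) {}

    // The setter now does more than assign: it also "audits" the change.
    void SetMyString(const std::string& str) {
        m_str1 = str;
        ++AuditCount();
    }

    const std::string& GetMyString() const { return m_str1; }

    // Hypothetical audit log, reduced to a counter for this sketch.
    static int& AuditCount() {
        static int count = 0;
        return count;
    }

private:
    std::string m_str1;
};

// Returns how many audit records a single copy produced (expected: none).
inline int audit_records_added_by_copy() {
    Container original;
    original.SetMyString("hello");  // a real change, so it is audited
    const int before = Container::AuditCount();
    Container copy(original);       // duplicating state is not a change
    assert(copy.GetMyString() == "hello");
    return Container::AuditCount() - before;
}
```

Had the copy constructor called SetMyString(other.GetMyString()) instead, every copy would have shown up in the audit trail as if someone had edited the string.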
Although I agree with other posters that there are many entry-level C++ "no-no's" in your sample, putting that to the side and answering your question directly:
In practice, I tend to make many but not all of my member fields* public to start with, and then move them to get/set when needed.
Now, I will be the first to say that this is not necessarily a recommended practice, and many practitioners will abhor this and say that every field should have setters/getters.
Maybe. But I find that in practice this isn't always necessary. Granted, it causes pain later when I change a field from public to a getter, and sometimes when I know what usage a class will have, I give it set/get and make the field protected or private from the start.
YMMV
RF
* You call fields "variables". I encourage you to use that term only for local variables within a function/method.