How to add a asString() / toString() to std::vector? - c++

My situation is a follows: There is some class MyList that will probably get a specific implemenation later on. For now, behavior like std::vector is fine.
However, I really need an easy way to call some kind of asString() / toString() method on it, because I'll need it in test assertions, debug output and so on. The only options I see are:
Public inheritence. I'll never delete such a list through a base-pointer, since there should never be any base pointers. If I do, there will be no pointer members, anyway. However, rule of thumb still states: Don't inherit from stl containers.
Some kind of "global" (actually in a namespace, of course) method that takes an instance of MyList as argument and does the asString() magic for me. In that case, MyList could be a simple typedef for std::vector.
I like neither of those options too much. Is there something else I failed to think of? Or if not - which way should I prefer?

what is wrong about the second approach? that is by far the easiest and also pretty elegant.-
Imagine the alternative of wrapping the vector. that would cause you alot of extra work and glue code that is error prone! I'd go with the function approach for sure!
edit: btw, i almost exclusively use free functions(sometimes static members) for conversions. Imagine you have a load of types that somehow need to be convertible to string. Having the toString() functions as free functions and not as members does not give you the headache you are heaving right now since you can basically simply overload the function as much as you want and don't have to touch any existing classes (or maybe classes that you don't even have source access to).
Then you can have a function like:
template<class T>
void printDebugInfo(const T & _obj)
{
std::cout<<toString(_obj)<<std::endl;
}
and you wont have the constraints you are experiencing.

Actually, free functions upon class types are a standard technique and are considered as part of the interface of a type. Read this GotW by Herb Sutter, one of people that have a voice in C++ standardization.
In general, prefer free functions over member functions. This increases encapsulation and re-usability and reduces class bloat and coupling. See this article by Scott Meyers for deeper information (highly regarded for his C++ books that you should definitely read if you want to improve your effective and clean use of C++).
Also note that you should never derive from STL containers. They are not designed as base classes and you might easily invoke undefined behaviour. But see Is there any real risk to deriving from the C++ STL containers? .

I think having a free
std::string toString( const MyList &l );
function is perfectly fine. If you are afraid of name clashes, you can consider a namespace as you said. This function is highly decoupled, and won't be able to tinker with private members of MyList objects (as is the case for a member or a friend function).
The only reason which would justify not making it a free function: you notice that you suddenly need to extend the public interface of MyList a lot just to be able to implement toString properly. In that case, I'd make it a friend function.

If you did something like:
template<typename T>
std::ostream& operator<< (std::ostream &strm, const MyList<T> &list)
{
if (list.empty())
return strm;
MyList<T>::const_iterator iter = list.begin(),
end = list.end();
// Write the first value
strm << *iter++;
while (iter != end)
strm << "," << *iter++;
return strm;
}
Then you would essentially have a to string for anything in the list, as long as the elements implement the streaming operator

Have you considered composition, as opposed to inheritance? i.e. Your MyList has a member variable of type std::vector.
You may complain that you will now need to replicate the API of std::vector in MyList. But you say that you might change the implementation later, so you'll need to do that anyway. You may as well do it straight away, to avoid having to change all the client code later on.

Inheritance is completely wrong in this case.
Global function approach is perfectly fine.
One of 'the ways' in C++ is to overload operator << and use stringstream, for example, to output your vector or something else.

I would go with global template function printOnStream. That way you can easily add support for other data types and using stream is more general than creating a string.
I would not use inheritance because there might be some tricky cases. Basic thing is as you mentioned - lack of virtual destructor. But also everything that expects std::vector won't work properly with your data type - for example you can run into slicing problem.

Why don't your debug and assert methods do this for you?

Related

Subclass/inherit standard containers?

I often read this statements on Stack Overflow. Personally, I don't find any problem with this, unless I am using it in a polymorphic way; i.e. where I have to use virtual destructor.
If I want to extend/add the functionality of a standard container then what is a better way than inheriting one? Wrapping those container inside a custom class requires much more effort and is still unclean.
There are a number of reasons why this a bad idea.
First, this is a bad idea because the standard containers do not have virtual destructors. You should never use something polymorphically that does not have virtual destructors, because you cannot guarantee cleanup in your derived class.
Basic rules for virtual dtors
Second, it is really bad design. And there are actually several reasons it is bad design. First, you should always extend the functionality of standard containers through algorithms that operate generically. This is a simple complexity reason - if you have to write an algorithm for every container it applies to and you have M containers and N algorithms, that is M x N methods you must write. If you write your algorithms generically, you have N algorithms only. So you get much more reuse.
It is also really bad design because you are breaking a good encapsulation by inheriting from the container. A good rule of thumb is: if you can perform what you need using the public interface of a type, make that new behavior external to the type. This improves encapsulation. If it's a new behavior you want to implement, make it a namespace scope function (like the algorithms). If you have a new invariant to impose, use containment in a class.
A classic description of encapsulation
Finally, in general, you should never think about inheritance as a means to extend the behavior of a class. This is one of the big, bad lies of early OOP theory that came about due to unclear thinking about reuse, and it continues to be taught and promoted to this day even though there is a clear theory why it is bad. When you use inheritance to extend behavior, you are tying that extended behavior to your interface contract in a way that ties users hands to future changes. For instance, say you have a class of type Socket that communicates using the TCP protocol and you extend it's behavior by deriving a class SSLSocket from Socket and implementing the behavior of the higher SSL stack protocol on top of Socket. Now, let's say you get a new requirement to have the same protocol of communications, but over a USB line, or over telephony. You would need to cut and paste all that work to a new class that derives from a USB class, or a Telephony class. And now, if you find a bug, you have to fix it in all three places, which won't always happen, which means bugs will take longer and not always get fixed...
This is general to any inheritance hierarchy A->B->C->... When you want to use the behaviors you've extended in derived classes, like B, C, .. on objects not of the base class A, you've got to redesign or you are duplicating implementation. This leads to very monolithic designs that are very hard to change down the road (think Microsoft's MFC, or their .NET, or - well, they make this mistake a lot). Instead, you should almost always think of extension through composition whenever possible. Inheritance should be used when you are thinking "Open / Closed Principle". You should have abstract base classes and dynamic polymorphism runtime through inherited class, each will full implementations. Hierarchies shouldn't be deep - almost always two levels. Only use more than two when you have different dynamic categories that go to a variety of functions that need that distinction for type safety. In those cases, use abstract bases until the leaf classes, which have the implementation.
Maybe many people here will not like this answer, but it is time for some heresy to be told and yes ... be told also that "the king is naked!"
All the motivation against the derivation are weak. Derivation is not different than composition. It's just a way to "put things together".
Composition puts things together giving them names, inheritance does it without giving explicit names.
If you need a vector that has the same interface and implementation of std::vector plus something more, you can:
use composition and rewrite all the embedded object function prototypes implementing function that delegates them (and if they are 10000... yes: be prepared to rewrite all those 10000) or...
inherit it and add just what you need (and ... just rewrite constructors, until C++ lawyers will decide to let them be inheritable as well: I still remember 10 year ago zealot discussion about "why ctors cannot call each other" and why it is a "bad bad bad thing" ... until C++11 permitted it and suddenly all those zealots shut up!) and let the new destructor be non-virtual as it was in the original one.
Just like for every class that has some virtual method and some not, you know you cannot pretend to invoke the non-virtual method of derived by addressing the base, the same applies for delete. There is no reason just for delete to pretend any particular special care.
A programmer who knows that whatever is not virtual isn't callable addressing the base, also knows not to use delete on your base after allocating your derived.
All the "avoid this", "don't do that", always sound as "moralization" of something that is natively agnostic. All the features of a language exist to solve some problem. The fact a given way to solve the problem is good or bad depends on the context, not on the feature itself.
If what you're doing needs to serve many containers, inheritance is probably not the way (you have to redo for all). If it is for a specific case ... inheritance is a way to compose. Forget OOP purisms: C++ is not a "pure OOP" language, and containers are not OOP at all.
Publicly inheriting is a problem for all the reasons others have stated, namely that your container can be upcasted to the base class which does not have a virtual destructor or virtual assignment operator, which can lead to slicing problems.
Privately inheriting, on the other hand, is less of an issue. Consider the following example:
#include <vector>
#include <iostream>
// private inheritance, nobody else knows about the inheritance, so nobody is upcasting my
// container to a std::vector
template <class T> class MyVector : private std::vector<T>
{
private:
// in case I changed to boost or something later, I don't have to update everything below
typedef std::vector<T> base_vector;
public:
typedef typename base_vector::size_type size_type;
typedef typename base_vector::iterator iterator;
typedef typename base_vector::const_iterator const_iterator;
using base_vector::operator[];
using base_vector::begin;
using base_vector::clear;
using base_vector::end;
using base_vector::erase;
using base_vector::push_back;
using base_vector::reserve;
using base_vector::resize;
using base_vector::size;
// custom extension
void reverse()
{
std::reverse(this->begin(), this->end());
}
void print_to_console()
{
for (auto it = this->begin(); it != this->end(); ++it)
{
std::cout << *it << '\n';
}
}
};
int main(int argc, char** argv)
{
MyVector<int> intArray;
intArray.resize(10);
for (int i = 0; i < 10; ++i)
{
intArray[i] = i + 1;
}
intArray.print_to_console();
intArray.reverse();
intArray.print_to_console();
for (auto it = intArray.begin(); it != intArray.end();)
{
it = intArray.erase(it);
}
intArray.print_to_console();
return 0;
}
OUTPUT:
1
2
3
4
5
6
7
8
9
10
10
9
8
7
6
5
4
3
2
1
Clean and simple, and gives you the freedom to extend std containers without much effort.
And if you think about doing something silly, like this:
std::vector<int>* stdVector = &intArray;
You get this:
error C2243: 'type cast': conversion from 'MyVector<int> *' to 'std::vector<T,std::allocator<_Ty>> *' exists, but is inaccessible
You should refrain from deriving publicly from standard contianers. You may choose between private inheritance and composition and it seems to me that all the general guidelines indicate that composition is better here since you don't override any function. Don't derive publicly form STL containers - there really isn't any need of it.
By the way, if you want to add a bunch of algorithms to the container, consider adding them as freestanding functions taking an iterator range.
The problem is that you, or someone else, might accidentally pass your extended class to a function expecting a reference to the base class. That will effectively (and silently!) slice off the extensions and create some hard to find bugs.
Having to write some forwarding functions seems like a small price to pay in comparison.
Because you can never guarantee that you haven't used them in a polymorphic way. You're begging for problems. Taking the effort to write a few functions is no big deal, and, well, even wanting to do this is dubious at best. What happened to encapsulation?
Most common reason to want to inherit from the containers is because you want to add some member function to the class. Since stdlib itself is not modifiable, inheritance is thought to be the substitute. This does not work however. It's better to do a free function that takes a vector as parameter:
void f(std::vector<int> &v) { ... }
IMHO, I don't find any harm in inheriting STL containers if they are used as functionality extensions. (That's why I asked this question. :) )
The potential problem can occur when you try to pass the pointer/reference of your custom container to a standard container.
template<typename T>
struct MyVector : std::vector<T> {};
std::vector<int>* p = new MyVector<int>;
//....
delete p; // oops "Undefined Behavior"; as vector::~vector() is not 'virtual'
Such problems can be avoided consciously, provided good programming practice is followed.
If I want to take extreme care then I can go upto this:
#include<vector>
template<typename T>
struct MyVector : std::vector<T> {};
#define vector DONT_USE
Which will disallow using vector entirely.

Simple design choice?

Say I have a class with a private data member n and a public get_n() function.
When overloading the output operator for example, I can either use get_n() or make it a friend and use n.
Is there a 'best' choice? And if so, why?
Or is the difference going to be optimized away?
Thanks.
Use get_n, since this is not a proper usage of friend. And if get_n is a simple return n, the compiler is most likely going to inline it automatically.
I will answer your question with a question:
Why did you create the public get_n() in the first place?
You've already gotten a lot of somewhat-conflicting answers, so what you undoubtedly need is one more that contradicts nearly all of them.
From an efficiency viewpoint, it's unlikely to make any difference. A function that just returns a value will undoubtedly be generated inline unless you specifically prohibit that from happening by turning off all optimization.
That leaves only a question of what's preferable from a design viewpoint. At least IMO, it's usually preferable to not have a get_n in the first place. Once you remove that design problem, the question you asked just disappears: since there is no get_n to start with, you can't write other code to depend upon it.
That does still leave a small question of how you should do things though. This (of course) leads to more questions. In particular, what sort of thing does n represent? I realize you're probably giving a hypothetical example, but a good design (in this case) depends on knowing a little more about what n is and how it's used, as well as the type of which n is a member, and how it is intended to be used as well.
If n is a member of a leaf class, from which you expect no derivation, then you should probably use a friend function that writes n out directly:
class whatever {
int n;
friend std::ostream &operator<<(std::ostream &os, whatever const &w) {
return os << w.n;
}
};
Simple, straightforward, and effective.
If, however, n is a member of something you expect to use (or be used) as a base class, then you usually want to use a "virtual virtual" function:
class whatever {
int n;
virtual std::ostream &write(std::ostream &os) {
return os << n;
}
friend std::ostream &operator<<(std::ostream &os, whatever const &w) {
return w.write(os);
}
};
Note, however, that this assumes you're interested in writing out an entire object, and it just happens that at least in the current implementation, that means writing out the value of n.
As to why you should do things this way, there are a few simple principles I think should be followed:
Either make something really private, or make it public. A private member with public get_n (and, as often as not, public set_n as well) may be required for JavaBeans (for one example) but is still a really bad idea, and shows a gross misunderstanding of object orientation or encapsulation, not to mention producing downright ugly code.
Tell, don't ask. A public get_n frequently means you end up with client code that does a read/modify/write cycle, with the object acting as dumb data container. It's generally preferable to convert that to a single operation in which the client code describes the desired result, and the object itself does the read/modify/write to achieve that result.
Minimize the interface. You should strive for each object to have the smallest interface possible without causing unduly pain for users. Eliminating a public function like get_n is nearly always a good thing in itself, independent of its being good for encapsulation.
Since others have commented about friend functions, I'll add my two cents worth on that subject as well. It's fairly frequent to hear comments to the effect that "friend should be avoided because it breaks encapsulation."
I must vehemently disagree, and further believe that anybody who thinks that still has some work to do in learning to think like a programmer. A programmer must think in terms of abstractions, and then implement those abstractions as reasonably as possible in the real world.
If an object supports input and/or output, then the input and output are parts of that object's interface, and whatever implements that interface is part of the object. The only other possibility is that the type of object does not support input and/or output.
The point here is pretty simple: at least to support the normal conventions, C++ inserters and extractors must be written as free (non-member) functions. Despite this, insertion and extraction are just as much a part of the class' interface as any other operations, and (therefore) the inserter/extractor are just as much a part of the class (as an abstraction) as anything else is.
I'd note for the record that this is part of why I prefer to implement the friend functions involved inside the class, as I've shown them above. From a logical viewpoint, they're part of the class, so making them look like part of the class is a good thing.
I'll repeat one last time for emphasis: Giving them access to class internals can't possibly break encapsulation, because in reality they're parts of the class -- and C++'s strange requirement that they be implemented as free functions does not change that fact by one, single, solitary iota.
In this case the best practice is for the class to implement a toString method, that the output operator uses to get a string representation. Since this is a member function, it can access all the data directly. It also has the added benefit that you can make this method vritual, so that subclasses can override it, and you only need a single output operator for the base class.
Can the operator be implemented without using friend? Yes- don't use friend. No- make friend.

I need some C++ guru's opinions on extending std::string

I've always wanted a bit more functionality in STL's string. Since subclassing STL types is a no no, mostly I've seen the recommended method of extension of these classes is just to write functions (not member functions) that take the type as the first argument.
I've never been thrilled with this solution. For one, it's not necessarily obvious where all such methods are in the code, for another, I just don't like the syntax. I want to use . when I call methods!
A while ago I came up with the following:
class StringBox
{
public:
StringBox( std::string& storage ) :
_storage( storage )
{
}
// Methods I wish std::string had...
void Format();
void Split();
double ToDouble();
void Join(); // etc...
private:
StringBox();
std::string& _storage;
};
Note that StringBox requires a reference to a std::string for construction... This puts some interesting limits on it's use (and I hope, means it doesn't contribute to the string class proliferation problem)... In my own code, I'm almost always just declaring it on the stack in a method, just to modify a std::string.
A use example might look like this:
string OperateOnString( float num, string a, string b )
{
string nameS;
StringBox name( nameS );
name.Format( "%f-%s-%s", num, a.c_str(), b.c_str() );
return nameS;
}
My question is: What do the C++ guru's of the StackOverflow community think of this method of STL extension?
I've never been thrilled with this solution. For one, it's not necessarily obvious where all such methods are in the code, for another, I just don't like the syntax. I want to use . when I call methods!
And I want to use $!---& when I call methods! Deal with it. If you're going to write C++ code, stick to C++ conventions. And a very important C++ convention is to prefer non-member functions when possible.
There is a reason C++ gurus recommend this:
It improves encapsulation, extensibility and reuse. (std::sort can work with all iterator pairs because it isn't a member of any single iterator or container class. And no matter how you extend std::string, you can not break it, as long as you stick to non-member functions. And even if you don't have access to, or aren't allowed to modify, the source code for a class, you can still extend it by defining nonmember functions)
Personally, I can't see the point in your code. Isn't this a lot simpler, more readable and shorter?
string OperateOnString( float num, string a, string b )
{
string nameS;
Format(nameS, "%f-%s-%s", num, a.c_str(), b.c_str() );
return nameS;
}
// or even better, if `Format` is made to return the string it creates, instead of taking it as a parameter
string OperateOnString( float num, string a, string b )
{
return Format("%f-%s-%s", num, a.c_str(), b.c_str() );
}
When in Rome, do as the Romans, as the saying goes. Especially when the Romans have good reasons to do as they do. And especially when your own way of doing it doesn't actually have a single advantage. It is more error-prone, confusing to people reading your code, non-idiomatic and it is just more lines of code to do the same thing.
As for your problem that it's hard to find the non-member functions that extend string, place them in a namespace if that's a concern. That's what they're for. Create a namespace StringUtil or something, and put them there.
As most of us "gurus" seem to favour the use of free functions, probably contained in a namespace, I think it safe to say that your solution will not be popular. I'm afraid I can't see one single advantage it has, and the fact that the class contains a reference is an invitation to that becoming a dangling reference.
I'll add a little something that hasn't already been posted. The Boost String Algorithms library has taken the free template function approach, and the string algorithms they provide are spectacularly re-usable for anything that looks like a string: std::string, char*, std::vector, iterator pairs... you name it! And they put them all neatly in the boost::algorithm namespace (I often use using namespace algo = boost::algorithm to make string manipulation code more terse).
So consider using free template functions for your string extensions, and look at Boost String Algorithms on how to make them "universal".
For safe printf-style formatting, check out Boost.Format. It can output to strings and streams.
I too wanted everything to be a member function, but I'm now starting to see the light. UML and doxygen are always pressuring me to put functions inside of classes, because I was brainwashed by the idea that C++ API == class hierarchy.
If the scope of the string isn't the same as the StringBox you can get segfaults:
StringBox foo() {
string s("abc");
return StringBox(s);
}
At least prevent object copying by declaring the assignment operator and copy ctor private:
class StringBox {
//...
private:
void operator=(const StringBox&);
StringBox(const StringBox&);
};
EDIT: regarding API, in order to prevent surprises I would make the StringBox own its copy of the string. I can think fo 2 ways to do this:
Copy the string to a member (not a reference), get the result later - also as a copy
Access your string through a reference-counting smart pointer like std::tr1::shared_ptr or boost:shared_ptr, to prevent extra copying
The problem with loose functions is that they're loose functions.
I would bet money that most of you have created a function that was already provided by the STL because you simply didn't know the STL function existed, or that it could do what you were trying to accomplish.
It's a fairly punishing design, especially for new users. (The STL gets new additions too, further adding to the problem.)
Google: C++ to string
How many results mention: std::to_string
I'm just as likely to find some ancient C method, or some homemade version, as I am to find the STL version of any given function.
I much prefer member methods because you don't have to struggle to find them, and you don't need to worry about finding old deprecated versions, etc,. (ie, string.SomeMethod, is pretty much guaranteed to be the method you should be using, and it gives you something concrete to Google for.)
C# style extension methods would be a good solution.
They're loose functions.
They show up as member functions via intellisense.
This should allow everyone to do exactly what they want.
It seems like it could be accomplished in the IDE itself, rather than requiring any language changes.
Basically, if the interpreter hits some call to a member that doesn't exist, it can check headers for matching loose functions, and dynamically fix it up before passing it on to the compiler.
Something similar could be done when it's loading up the intellisense data.
I have no idea how this could be worked for existing functions, no massive change like this should be taken lightly, but, for new functions using a new syntax, it shouldn't be a problem.
namespace StringExt
{
std::string MyFunc(this std::string source);
}
That can be used by itself, or as a member of std::string, and the IDE can handle all the grunt work.
Of course, this still leaves the problem of methods being spread out over various headers, which could be solved in various ways.
Some sort of extension header: string_ext which could include common methods.
Hmm....
That's a tougher issue to solve without causing issues...
If you want to extend the methods available to act on string, I would extend it by creating a class that has static methods that take the standard string as a parameter.
That way, people are free to use your utilities, but don't need to change the signatures of their functions to take a new class.
This breaks the object-oriented model a little, but makes the code much more robust - i.e. if you change your string class, then it doesn't have as much impact on other code.
Follow the recommended guidelines, they are there for a reason :)
The best way is to use templated free functions. The next best is private inheritance struct extended_str : private string, which happens to get easier in C++0x by the way as you can using constructors. Private inheritance is too much trouble and too risky just to add some algorithms. What you are doing is too risky for anything.
You've just introduced a nontrivial data structure to accomplish a change in code punctuation. You have to manually create and destroy a Box for each string, and you still need to distinguish your methods from the native ones. You will quickly get tired of this convention.

C++: Copy constructor: Use getters or access member vars directly?

I have a simple container class with a copy constructor.
Do you recommend using getters and setters, or accessing the member variables directly?
public Container
{
public:
Container() {}
Container(const Container& cont) //option 1
{
SetMyString(cont.GetMyString());
}
//OR
Container(const Container& cont) //option 2
{
m_str1 = cont.m_str1;
}
public string GetMyString() { return m_str1;}
public void SetMyString(string str) { m_str1 = str;}
private:
string m_str1;
}
In the example, all code is inline, but in our real code there is no inline code.
Update (29 Sept 09):
Some of these answers are well written however they seem to get missing the point of this question:
this is simple contrived example to discuss using getters/setters vs variables
initializer lists or private validator functions are not really part of this question. I'm wondering if either design will make the code easier to maintain and expand.
Some ppl are focusing on the string in this example however it is just an example, imagine it is a different object instead.
I'm not concerned about performance. we're not programming on the PDP-11
EDIT: Answering the edited question :)
this is simple contrived example to
discuss using getters/setters vs
variables
If you have a simple collection of variables, that don't need any kind of validation, nor additional processing then you might consider using a POD instead. From Stroustrup's FAQ:
A well-designed class presents a clean
and simple interface to its users,
hiding its representation and saving
its users from having to know about
that representation. If the
representation shouldn't be hidden -
say, because users should be able to
change any data member any way they
like - you can think of that class as
"just a plain old data structure"
In short, this is not JAVA. you shouldn't write plain getters/setters because they are as bad as exposing the variables them selves.
initializer lists or private validator functions are not really
part of this question. I'm wondering
if either design will make the code
easier to maintain and expand.
If you are copying another object's variables, then the source object should be in a valid state. How did the ill formed source object got constructed in the first place?! Shouldn't constructors do the job of validation? aren't the modifying member functions responsible of maintaining the class invariant by validating input? Why would you validate a "valid" object in a copy constructor?
I'm not concerned about performance. we're not programming on the PDP-11
This is about the most elegant style, though in C++ the most elegant code has the best performance characteristics usually.
You should use an initializer list. In your code, m_str1 is default constructed then assigned a new value. Your code could be something like this:
class Container
{
public:
Container() {}
Container(const Container& cont) : m_str1(cont.m_str1)
{ }
string GetMyString() { return m_str1;}
void SetMyString(string str) { m_str1 = str;}
private:
string m_str1;
};
#cbrulak You shouldn't IMO validate cont.m_str1 in the copy constructor. What I do, is to validate things in constructors. Validation in copy constructor means you you are copying an ill formed object in the first place, for example:
Container(const string& str) : m_str1(str)
{
if(!valid(m_str1)) // valid() is a function to check your input
{
// throw an exception!
}
}
You should use an initializer list, and then the question becomes meaningless, as in:
Container(const Container& rhs)
: m_str1(rhs.m_str1)
{}
There's a great section in Matthew Wilson's Imperfect C++ that explains all about Member Initializer Lists, and about how you can use them in combination with const and/or references to make your code safer.
Edit: an example showing validation and const:
class Container
{
public:
Container(const string& str)
: m_str1(validate_string(str))
{}
private:
static const string& validate_string(const string& str)
{
if(str.empty())
{
throw runtime_error("invalid argument");
}
return str;
}
private:
const string m_str1;
};
As it's written right now (with no qualification of the input or output) your getter and setter (accessor and mutator, if you prefer) are accomplishing absolutely nothing, so you might as well just make the string public and be done with it.
If the real code really does qualify the string, then chances are pretty good that what you're dealing with isn't properly a string at all -- instead, it's just something that looks a lot like a string. What you're really doing in this case is abusing the type system, sort of exposing a string, when the real type is only something a bit like a string. You're then providing the setter to try to enforce whatever restrictions the real type has compared to a real string.
When you look at it from that direction, the answer becomes fairly obvious: rather than a string, with a setter to make the string act like some other (more restricted) type, what you should be doing instead is defining an actual class for the type you really want. Having defined that class correctly, you make an instance of it public. If (as seems to be the case here) it's reasonable to assign it a value that starts out as a string, then that class should contain an assignment operator that takes a string as an argument. If (as also seems to be the case here) it's reasonable to convert that type to a string under some circumstances, it can also include cast operator that produces a string as the result.
This gives a real improvement over using a setter and getter in a surrounding class. First and foremost, when you put those in a surrounding class, it's easy for code inside that class to bypass the getter/setter, losing enforcement of whatever the setter was supposed to enforce. Second, it maintains a normal-looking notation. Using a getter and a setter forces you to write code that's just plain ugly and hard to read.
One of the major strengths of a string class in C++ is using operator overloading so you can replace something like:
strcpy(strcat(filename, ".ext"));
with:
filename += ".ext";
to improve readability. But look what happens if that string is part of a class that forces us to go through a getter and setter:
some_object.setfilename(some_object.getfilename()+".ext");
If anything, the C code is actually more readable than this mess. On the other hand, consider what happens if we do the job right, with a public object of a class that defines an operator string and operator=:
some_object.filename += ".ext";
Nice, simple and readable, just like it should be. Better still, if we need to enforce something about the string, we can inspect only that small class, we really only have to look one or two specific, well-known places (operator=, possibly a ctor or two for that class) to know that it's always enforced -- a totally different story from when we're using a setter to try to do the job.
Do you anticipate how the string is returned, eg. white space trimmed, null checked, etc.? Same with SetMyString(), if the answer is yes, you are better off with access methods since you don't have to change your code in zillion places but just modify those getter and setter methods.
Ask yourself what the costs and benefits are.
Cost: higher runtime overhead. Calling virtual functions in ctors is a bad idea, but setters and getters are unlikely to be virtual.
Benefits: if the setter/getter does something complicated, you're not repeating code; if it does something unintuitive, you're not forgetting to do that.
The cost/benefit ratio will differ for different classes. Once you're ascertained that ratio, use your judgment. For immutable classes, of course, you don't have setters, and you don't need getters (as const members and references can be public as no one can change/reseat them).
There's no silver bullet as how to write the copy constructor.
If your class only has members which provide a copy constructor that creates
instances which do not share state (or at least do not appear to do so) using an initializer list is a good way.
Otherwise you'll have to actually think.
struct alpha {
beta* m_beta;
alpha() : m_beta(new beta()) {}
~alpha() { delete m_beta; }
alpha(const alpha& a) {
// need to copy? or do you have a shared state? copy on write?
m_beta = new beta(*a.m_beta);
// wrong
m_beta = a.m_beta;
}
Note that you can get around the potential segfault by using smart_ptr - but you can have a lot of fun debugging the resulting bugs.
Of course it can get even funnier.
Members which are created on demand.
new beta(a.beta) is wrong in case you somehow introduce polymorphism.
... a screw the otherwise - please always think when writing a copy constructor.
Why do you need getters and setters at all?
Simple :) - They preserve invariants - i.e. guarantees your class makes, such as "MyString always has an even number of characters".
If implemented as intended, your object is always in a valid state - so a memberwise copy can very well copy the members directly without fear of breaking any guarantee. There is no advantage of passing already validated state through another round of state validation.
As AraK said, the best would be using an initializer list.
Not so simple (1):
Another reason to use getters/setters is not relying on implementation details. That's a strange idea for a copy CTor, when changing such implementation details you almost always need to adjust CDA anyway.
Not so simple (2):
To prove me wrong, you can construct invariants that are dependent on the instance itself, or another external factor. One (very contrieved) example: "if the number of instances is even, the string length is even, otherwise it's odd." In that case, the copy CTor would have to throw, or adjust the string. In such a case it might help to use setters/getters - but that's not the general cas. You shouldn't derive general rules from oddities.
I prefer using an interface for outer classes to access the data, in case you want to change the way it's retrieved. However, when you're within the scope of the class and want to replicate the internal state of the copied value, I'd go with data members directly.
Not to mention that you'll probably save a few function calls if the getter are not inlined.
If your getters are (inline and) not virtual, there's no pluses nor minuses in using them wrt direct member access -- it just looks goofy to me in terms of style, but, no big deal either way.
If your getters are virtual, then there is overhead... but nevertheless that's exactly when you DO want to call them, just in case they're overridden in a subclass!-)
There is a simple test that works for many design questions, this one included: add side-effects and see what breaks.
Suppose setter not only assigns a value, but also writes audit record, logs a message or raises an event. Do you want this happen for every property when copying object? Probably not - so calling setters in constructor is logically wrong (even if setters are in fact just assignments).
Although I agree with other posters that there are many entry-level C++ "no-no's" in your sample, putting that to the side and answering your question directly:
In practice, I tend to make many but not all of my member fields* public to start with, and then move them to get/set when needed.
Now, I will be the first to say that this is not necessarily a recommended practice, and many practitioners will abhor this and say that every field should have setters/getters.
Maybe. But I find that in practice this isn't always necessary. Granted, it causes pain later when I change a field from public to a getter, and sometimes when I know what usage a class will have, I give it set/get and make the field protected or private from the start.
YMMV
RF
you call fields "variables" - I encourage you to use that term only for local variables within a function/method

Member functions for derived information in a class

While designing an interface for a class I normally get caught in two minds whether should I provide member functions which can be calculated / derived by using combinations of other member functions. For example:
class DocContainer
{
public:
Doc* getDoc(int index) const;
bool isDocSelected(Doc*) const;
int getDocCount() const;
//Should this method be here???
//This method returns the selected documents in the contrainer (in selectedDocs_out)
void getSelectedDocs(std::vector<Doc*>& selectedDocs_out) const;
};
Should I provide this as a class member function or probably a namespace where I can define this method? Which one is preferred?
In general, you should probably prefer free functions. Think about it from an OOP perspective.
If the function does not need access to any private members, then why should it be given access to them? That's not good for encapsulation. It means more code that may potentially fail when the internals of the class is modified.
It also limits the possible amount of code reuse.
If you wrote the function as something like this:
template <typename T>
bool getSelectedDocs(T& container, std::vector<Doc*>&);
Then the same implementation of getSelectedDocs will work for any class that exposes the required functions, not just your DocContainer.
Of course, if you don't like templates, an interface could be used, and then it'd still work for any class that implemented this interface.
On the other hand, if it is a member function, then it'll only work for this particular class (and possibly derived classes).
The C++ standard library follows the same approach. Consider std::find, for example, which is made a free function for this precise reason. It doesn't need to know the internals of the class it's searching in. It just needs some implementation that fulfills its requirements. Which means that the same find() implementation can work on any container, in the standard library or elsewhere.
Scott Meyers argues for the same thing.
If you don't like it cluttering up your main namespace, you can of course put it into a separate namespace with functionality for this particular class.
I think its fine to have getSelectedDocs as a member function. It's a perfectly reasonable operation for a DocContainer, so makes sense as a member. Member functions should be there to make the class useful. They don't need to satisfy some sort of minimality requirement.
One disadvantage to moving it outside the class is that people will have to look in two places when the try to figure out how to use a DocContainer: they need to look in the class and also in the utility namespace.
The STL has basically aimed for small interfaces, so in your case, if and only if getSelectedDocs can be implemented more efficiently than a combination of isDocSelected and getDoc it would be implemented as a member function.
This technique may not be applicable anywhere but it's a good rule of thumbs to prevent clutter in interfaces.
I agree with the answers from Konrad and jalf. Unless there is a significant benefit from having "getSelectedDocs" then it clutters the interface of DocContainer.
Adding this member triggers my smelly code sensor. DocContainer is obviously a container so why not use iterators to scan over individual documents?
class DocContainer
{
public:
iterator begin ();
iterator end ();
// ...
bool isDocSelected (Doc *) const;
};
Then, use a functor that creates the vector of documents as it needs to:
typedef std::vector <Doc*> DocVector;
class IsDocSelected {
public:
IsDocSelected (DocContainer const & docs, DocVector & results)
: docs (docs)
, results (results)
{}
void operator()(Doc & doc) const
{
if (docs.isDocSelected (&doc))
{
results.push_back (&doc);
}
}
private:
DocContainer const & docs;
DocVector & results;
};
void foo (DocContainer & docs)
{
DocVector results;
std :: for_each (docs.begin ()
, docs.end ()
, IsDocSelected (docs, results));
}
This is a bit more verbose (at least until we have lambdas), but an advantage to this kind of approach is that the specific type of filtering is not coupled with the DocContainer class. In the future, if you need a new list of documents that are "NotSelected" there is no need to change the interface to DocContainer, you just write a new "IsDocNotSelected" class.
The answer is proabably "it depends"...
If the class is part of a public interface to a library that will be used by many different callers then there's a good argument for providing a multitude of functionality to make it easy to use, including some duplication and/or crossover. However, if the class is only being used by a single upstream caller then it probably doesn't make sense to provide multiple ways to achieve the same thing. Remember that all the code in the interface has to be tested and documented, so there is always a cost to adding that one last bit of functionality.
I think this is perfectly valid if the method:
fits in the class responsibilities
is not too specific to a small part of the class clients (like at least 20%)
This is especially true if the method contains complex logic/computation that would be more expensive to maintain in many places than only in the class.