I am making a project that will use mathematic computations a lot. Also I want to be able to simply change the implementation of real numbers. Let's say between float, double, my own implementation and gmplib float types.
So far I thouht of two ways:
I create a class "Number" which will interface with the rest of the program.
I typedef the arithmetic type and write global functions to interface with the rest of the program.
The first choice seems to be more elegant, but the second seems to have less overhead. Is there a third better choice? Also I am worried by the elementary mathematical functions such as sine, cosine, exp... I figured out that to make the switching easy, I should implement them as templates, but my implementations are hopelessly slow.
I am generally new to programming in C++. I was brought up in the comfortable Matlab and Mathematica environments, where I did not have to worry about such things.
You'll want to use templates with constraints to avoid re-implementing things.
For instance, say you want to use sin in your program differently for float and double. You can overload based on type and create specialized templates.
template<class T> T MySin(const T& f) {
return genericSin(f);
}
template<> float MySin<float>(float f) {
return sinf(f);
}
template<> double MySin<double>(double d) {
return sin(d);
}
For functions. The syntax is similar when partially specializing a Math class if you want to go the OO route. This will enable you to call your routines with any type and have the most specialized and most efficient routine called.
Templates are the way I have done this. it makes it easy to specialize what must be specialized, and provides a good way to reuse implementations when it applies to multiple types.
The number type can be done, but it's actually not simple to do right and introduces some restrictions (compared to templates).
Multiple types are just hopelessly complex, if you want something even close to fast, accurate, and simple to maintain. You'd likely end up using templates to implement these correctly if you were to create a global typedef.
Templates provide all the power, control, and flexibility you would need, and they will be faster than the alternatives posted (technically, #2 could be as fast if you resorted to... templates).
a template class like real numbers should work for you. in that you can overload the required functions and if required use template specializations.
in order to improve efficiency use STL algorithms instead of hand written loops.
good luck
Both alternatives are equivalent in terms of encapsulation: There will be a single point in your program where you'll have to change the number type, and this one change will affect your whole program. If presented with those two alternatives, choose the typedef; it is less elegant (=> simpler, and simpler is better) and has the same power.
When you get more comfortable with C++, templating your functions will be a better fit, since the determination of the number type can be made locally instead of globally. With templates, you determine the number type at the instantiation point (most likely the call site), giving much greater flexibility. However, there is a number of pitfalls in templates, and I'd recommend to you that you get a little more experience with C++ first and then start templating.
Related
Ive spent the day reading notes and watching a video on boost::fusion and I really don't get some aspects to it.
Take for example, the boost::fusion::has_key<S> function. What is the purpose of having this in boost::fusion? Is the idea that we just try and move as much programming as possible to happen at compile-time? So pretty much any boost::fusion function is the same as the run-time version, except it now evaluates at compile time? (and we assume doing more at compile-time is good?).
Related to boost::fusion, i'm also a bit confused why metafunctions always return types. Why is this?
Another way to look at boost::fusion is to think of it as "poor man introspection" library. The original motivation for boost::fusion comes from the direction of boost::spirit parser/generator framework, in particular the need to support what is called "parser attributes".
Imagine, you've got a CSV string to parse:
aaaa, 1.1
The type, this string parses into, can be described as "tuple of string and double". We can define such tuples in "plain" C++, either with old school structs (struct { string a; double b; } or newer tuple<string, double>). The only thing we miss is some sort of adapter, which will allow to pass tuples (and some other types) of arbitrary composition to a unified parser interface and expect it to make sense of it without passing any out of band information (such as string parsing templates used by scanf).
That's where boost::fusion comes into play. The most straightforward way to construct a "fusion sequence" is to adapt a normal struct:
struct a {
string s;
double d;
};
BOOST_FUSION_ADAPT_STRUCT(a, (string, s)(double, d))
The "ADAPT_STRUCT" macro adds the necessary information for parser framework (in this example) to be able to "iterate" over members of struct a to the tune of the following questions:
I just parsed a string. Can I assign it to first member of struct a?
I just parsed a double. Can I assign it to second member of struct a?
Are there any other members in struct a or should I stop parsing?
Obviously, this basic example can be further extended (and boost::fusion supplies the capability) to address much more complex cases:
Variants - let's say parser can encounter either sting or double and wants to assign it to the right member of struct a. BOOST_FUSION_ADAPT_ASSOC_STRUCT comes to the rescue (now our parser can ask questions like "which member of struct a is of type double?").
Transformations - our parser can be designed to accept certain types as parameters but the rest of the programs had changed quite a bit. Yet, fusion metafunctions can be conveniently used to adapt new types to old realities (or vice versa).
The rest of boost::fusion functionality naturally follows from the above basics. fusion really shines when there's a need for conversion (in either direction) of "loose IO data" to strongly typed/structured data C++ programs operate upon (if efficiency is of concern). It is the enabling factor behind spirit::qi and spirit::karma being such an efficient (probably the fastest) I/O frameworks .
Fusion is there as a bridge between compile-time and run-time containers and algorithms. You may or may not want to move some of your processing to compile-time, but if you do want to then Fusion might help. I don't think it has a specific manifesto to move as much as possible to compile-time, although I may be wrong.
Meta-functions return types because template meta-programming wasn't invented on purpose. It was discovered more-or-less by accident that C++ templates can be used as a compile-time programming language. A meta-function is a mapping from template arguments to instantiations of a template. As of C++03 there were are two kinds of template (class- and function-), therefore a meta-function has to "return" either a class or a function. Classes are more useful than functions, since you can put values etc. in their static data members.
C++11 adds another kind of template (for typedefs), but that is kind of irrelevant to meta-programming. More importantly for compile-time programming, C++11 adds constexpr functions. They're properly designed for the purpose and they return values just like normal functions. Of course, their input is not a type, so they can't be mappings from types to something else in the way that templates can. So in that sense they lack the "meta-" part of meta-programming. They're "just" compile-time evaluation of normal C++ functions, not meta-functions.
For some time I've been designing my class interfaces to be minimal, preferring namespace-wrapped non-member functions over member functions. Essentially following Scott Meyer's advice in the article How Non-Member Functions Improve Encapsulation.
I've been doing this with good effect in a few small scale projects, but I'm wondering how well it works on a larger scale. Are there any large, well regarded open-source C++ projects that I can take a look at and perhaps reference where this advice is strongly followed?
Update: Thanks for all the input, but I'm not really interested in opinion so much as finding out how well it works in practice on a larger scale. Nick's answer is closest in this regard, but I'd like to be able to see the code. Any sort of detailed description of practical experiences (positives, negatives, practical considerations, etc) would be acceptable as well.
I do this quite a bit on the project I work on; the largest of which at my current company is around 2M lines, but it's not open source, so I can't provide it as a reference. However, I will say that I agree with the advice, generally speaking. The more you can separate the functionality which is not strictly contained to just one object from that object, the better your design will be.
By way of an example, consider the classic polymorphism example: a Shape base class with subclasses, and a virtual Draw() function. In the real world, Draw() would need to take some drawing context, and potentially be aware of the state of other things being drawn, or the application in general. Once you put all that into each subclass implementation of Draw(), you're likely to have some code overlap, or most of your actual Draw() logic will be in the base class, or somewhere else. Then consider that if you want to re-use some of that code, you'll need to provide more entry points into the interface, and possibly pollute the functions with other code not related to drawing shapes (eg: multi-shape drawing correlation logic). Before long, it'll be a mess, and you'll wish you had a draw function which took a Shape (and context, and other data) instead, and Shape just had functions/data which were entirely encapsulated and not using or referencing external objects.
Anyway, that's my experience/advice, for what it's worth.
I'd argue that the benefit of non-member functions increases as the size of the project increases. The standard library containers, iterators, and algorithms library are proof of this.
If you can decouple algorithms from data structures (or, to phrase it another way, if you can decouple what you do with objects from how their internal state is manipulated), you can decrease coupling between your classes and take greater advantage of generic code.
Scott Meyers isn't the only author who has argued in favor of this principle; Herb Sutter has too, especially in Monoliths Unstrung, which ends with the guideline:
Where possible, prefer writing functions as nonmember nonfriends.
I think one of the best examples of an unneccessary member function from that article is std::basic_string::find; there is no reason for it to exist, really, as std::find provides exactly the same functionality.
OpenCV library does this. They have a cv::Mat class that presents a 3D matrix (or images). Then they have all the other functions in the cv namespace.
OpenCV library is huge and is widely regarded in its field.
One practical advantage of writing functions as nonmember nonfriends is that doing so can significantly reduce the time it takes to thoroughly test and verify the code.
Consider, for example, the sequence container member functions insert and push_back. There are at least two approaches to implementing push_back:
It can simply call insert (it's behavior is defined in terms of insert anyway)
It can do all the work that insert would do (possibly calling private helper functions) without actually calling insert
Obviously, when implementing a sequence container, you probably want to use the first approach. push_back is just a special form of insert and (to the best of my knowledge) you can't really get any performance benefit by implementing push_back some other way (at least not for list, deque, or vector).
However, to thoroughly test such a container, you have to test push_back separately: since push_back is a member function, it can modify any and all of the internal state of the container. From a testing standpoint, you should (must?) assume that push_back is implemented using the second approach because it is possible that it could be implemented using the second approach. There is no guarantee that it is implemented in terms of insert.
If push_back is implemented as a nonmember nonfriend, it can't touch any of the internal state of the container; it must use the first approach. When you write tests for it, you know that it can't break the internal state of the container (assuming the actual container member functions are implemented correctly). You can use that knowledge to significantly reduce the number of tests that you need to write to fully exercise the code.
(I don't have time to write this up nicely, the following's a 5 minute brain dump which doubtless can be ripped apart at various trival levels, but please address the concepts and general thrust.)
I have considerable sympathy for the position taken by Jonathan Grynspan, but want to say a bit more about it than can reasonably be done in comments.
First - a "well said" to Alf Steinbach, who chipped in with "It's only over-simplified caricatures of their viewpoints that might seem to be in conflict. For what it's worth I don't agree with Scott Meyers on this matter; as I see it he's over-generalizing here, or he was."
Scott, Herb etc. were making these points when few people understood the trade-offs or alternatives, and they did so with disproportionate strength. Some nagging hassles people had during evolution of code were analysed and a new design approach addressing those issues was rationally derived. Let's return to the question of whether there were downsides later, but first - worth saying that the pain in question was typically small and infrequent: non-member functions are just one small aspect of designing reusable code, and in enterprise scale systems I've worked on simply writing the same kind of code you'd have put into a member function as a non-member is rarely enough to make the non-members reusable. It's pretty rare for them to even express algorithms that are both complex enough to be worth reusing and yet not tightly bound to the specific of the class they were designed for, that being weird enough that it's practically inconceivable some other class will happen along supporting the same operations and semantics. Often, you also need to template arguments, or introduce a base class to abstract the set of operations required. Both have significant implications in terms of performance, being inline vs out-of-line, client-code recompilation.
That said, there's often less code changes and impact study required when changing implementation if operations have been implementing in terms of a public interface, and being a non-friend non-member systematically enforces that. Occasionally though, it makes the initial implementation more verbose or in some other way less desirable and maintainble.
But, as a litmus test - how many of these non-member functions sit in the same header as the only class for which they're currently applicable? How many want to abstract their arguments via templates (which means inlining, compilation dependencies) or base classes (virtual function overheads) to allow reuse? Both discourage people from seeing them as reusable, but when not the case, the operations available on a class are delocalised, which can frustrate developers perception of a system: the develop often has to work out for themselves the rather disappointing fact that - "oh - that will only work for class X".
Bottom line: most member functions aren't potentially reusable. Much corporate code isn't broken into clean algorithm versus data with potential for reuse of the former. That kind of division just isn't required or useful or conceivably useful 20 years down the road. It's much the same as get/set methods - they're needed at certain API boundaries, but can constitute needless verbosity when ownership and use of the code is localised.
Personally, I don't have an all or nothing approach to this, but decide what to make a member function or non-member based on whether there's any likely benefit to either, potential reusability versus locality of interface.
I also do this alot, where it seems to make sense, and it causes absolutely no problems with scaling. (although my current project is only 40000 LOC) In fact, I think it makes the code more scalable - it slims down classes, reduces dependencies.
It sometimes requires you to refactor your functions to make them independent of members of the class - and thereby often creating a library of more general helper functions, which you can easly reuse elsewhere. I'd also mention that one of the common problems with many large projects is the bloating of classes - and I think preferring non-member, non-friend functions also helps here.
Prefer non-member non-friend functions for encapsulation UNLESS you want implicit conversions to work for class templates non-member functions (in which case you better make them friend functions):
That is, if you have a class template type<T>:
template<class T>
struct type {
void friend foo(type<T> a) {}
};
and a type implicitly convertible to type<T>, e.g.:
template<class T>
struct convertible_to_type {
operator type<T>() { }
};
The following works as expected:
auto t = convertible_to_type<int>{};
foo(t); // t is converted to type<int>
However, if you make foo a non-friend function:
template<class T>
void foo(type<T> a) {}
then the following doesn't work:
auto t = convertible_to_type<int>{};
foo(t); // FAILS: cannot deduce type T for type
Since you cannot deduce T then the function foo is removed from the overload resolution set, that is: no function is found, which means that the implicit conversion does not trigger.
template<class T>
void swap(T &a, T &b)
{
T t;
t = a;
a = b;
b = t;
}
replace
void swap(int &a, int &b)
{
int t;
t = a;
a = b;
b = t;
}
This is the simplest example I could come up with,but there should be many other complicated functions.Should I make all methods I write templated if possible?
Any disadvantages to do this?
thanks.
Genericity has the advantage of being reusable. However, write things generic, only if:
It doesn't take much more time to do that, than do it non-generic
It doesn't complicate the code more than a non-generic solution
You know will benefit from it later
However, know your standard library. The case you presented is already in STL as std::swap.
Also, remember that when writing generically using templates, you can optimize special cases by using template specialization. However, always to it when it's needed for performance, not as you write it.
Also, note that you have the question of run-time and compile-time performance here. Template-based solutions increase compile-time. Inline solutions can but not must decrease run-time.
`Cause "Premature optimization and genericity is the root of all evil". And you can quote me on that -_-.
Reusable code is reusable only if you actually reuse it. so write the function naturally in the first instance. If a bit later you come across a situation where the code could be reused with a little tweak, go back and refactor it, It is at the refactoring stage you should consider writing template functions.
The simplest answer to your question is what many people smarter than myself have been saying for years:
Never write more than the minimum you can get away with.
Make them as generic as you can trivially make them. If it's truly trivial (such as the above example) then it takes no extra work, and might save you some work in the future
The first time you write swap you shouldn't
The second time it might be tempting but sometime you can get away without making the whole thing a mess
The third time it should be clear that you must. However depending on how many places you've used one and two it might be time consuming so the second time should be a good decision
There are disadvantages to using templates all the time. It (can) greatly increase the compilation time of your program and can make compilation errors more difficult to understand.
As taldor said, don't make your functions more generic than they need to be.
You may take a look at the function parameters and the way they are used. If all operations are done through overloaded operators the function may be very generic and a good candidate to become a template. Otherwise, the presence of very specialized class types and functions calls may make generic reusability very problematic and any eventual flexibility should be rather realized through polymorphism.
A few thoughts:
Know the STL. There is std::swap already. Instead of spending your time making everything as generic as possible, spend your time becoming more familiar with the STL.
Don't do it till you need it: "Always implement things when you actually need them, never when you just foresee that you need them."---Ron Jeffries. If you don't actually reuse the code you didn't write reusable code, you wrote unnecessary code. Unnecessary code is expensive to develop, expensive to test, and expensive to maintain. Don't forget opportunity cost!
Keep things simple: "Make everything as simple as possible, but not simpler."---Albert Einstein. This is KISS.
One could break the question into two: how to read and to write templated code.
It is very easy to say, "it you want an array of doubles, write std::vector<double>", but it won't teach them how the templates work.
I'd probably try to demonstrate the power of templates, by demonstrating the annoyance of not using them.
A good demonstration would be to write something simple like a stack of doubles (hand-written, not STL), with methods push, pop, and foldTopTwo, which pops off and adds together the top two values in the stack, and pushes the new value back on.
Then tell them to do the same for ints (or whatever, just some different numeric type).
Then show them how, by writing this stack as a template, you can significantly reduce the number of lines of code, and all of that horrible duplication.
There is a saying: "If you can't explain it, you don't understand it."
You can break it down further: How to write code that uses templated code, and how to write code that provides a templated service to others.
The basic explanation is that templates generated code based on a template. That is the source of the term "meta programming". It is programming how programming should be done.
The essential complexity of a vector is not that it is a vector of doubles (or type T), but that it is a vector. The basic structure is the same and templates separate that which is consistent from that which is not.
Further explanation depends on how much of that makes sense to you!
IMHO it is best to explain them as (very) fancy macros. They just work at higher level than C-style text substitution macros.
I found it very instructive to look at duck-typed languages. It doesn't matter, there, what type of argument you give a function, as long as they offer the right interface.
Templates allows to do the same thing: you can take any type, as long as the right interface is present. The additional benefit over duck-typing is, that the interface is checked at compile-time.
Present them as advanced macros. It's a programming language on its own that is executed during compliation.
I would get them to implement something themselves, then experiment with different variations until they understand it. Learning by doing is almost always the better option with programming.
For example, get them to make a template which compares two values and returns the higher one. Then have them see has passing ints or doubles or whatever still allows it to work. Then get them to tweak the the code / copy it and have it return the minimum value. Again, experiment with variations - will the template allow them to pass an int and a double, or will it complain?
From there, you can have them pass in arrays of whatever type (int, double etc), and have it sort the array from highest to lowest, again encouraging experimentation. From there, start to move into templated class definitions, using the same kind of ideas but on a larger scale. This is pretty much how I learnt about templates, ending up with complex array manipulation classes for generic types.
When I was teaching myself C++ I used this site a lot. It explains templates in depth and very well. I would recommend having them read that and try implementing something simple.
For a shorter explanation: Templates are frameworks for complicated constructs that act on data without having to know what that data is. Give them some examples of a simple template (like a linked-list) and walk through how the template is used to generate the final class.
You can tell that a template is a half-written source with parameters to be filled while instatiating the template.
I have a char (ie. byte) buffer that I'm sending over the network. At some point in the future I might want to switch the buffer to a different type like unsigned char or short. I've been thinking about doing something like this:
typedef char bufferElementType;
And whenever I do anything with a buffer element I declare it as bufferElementType rather than char. That way I could switch to another type by changing this typedef (of course it wouldn't be that simple, but it would at least be easy to identify the places that need to be modified... there'll be a bufferElementType nearby).
Is this a valid / good use of typedef? Is it not worth the trouble? Is it going to give me a headache at some point in the future? Is it going to make maintainance programmers hate me?
I've read through When Should I Use Typedef In C++, but no one really covered this.
It is a great (and normal) usage. You have to be careful, though, that, for example, the type you select meet the same signed/unsigned criteria, or that they respond similarly to operators. Then it would be easier to change the type afterwards.
Another option is to use templates to avoid fixing the type till the moment you're compiling. A class that is defined as:
template <typename CharType>
class Whatever
{
CharType aChar;
...
};
is able to work with any char type you select, while it responds to all the operators in the same way.
Another advantage of typedefs is that, if used wisely, they can increase readability. As a really dumb example, a Meter and a Degree can both be doubles, but you'd like to differentiate between them. Using a typedef is onc quick & easy solution to make errors more visible.
Note: a more robust solution to the above example would have been to create different types for a meter and a degree. Thus, the compiler can enforce things itself. This requires a bit of work, which doesn't always pay off, however. Using typedefs is a quick & easy way to make errors visible, as described in the article linked above.
Yes, this is the perfect usage for typedef, at least in C.
For C++ it may be argued that templates are a better idea (as Diego Sevilla has suggested) but they have their drawbacks. (Extra work if everything using the data type is not already wrapped in a few classes, slower compilation times and a more complex source file structure, etc.)
It also makes sense to combine the two approaches, that is, give a typedef name to a template parameter.
Note that as you're sending data over a network, char and other integer types may not be interchangeable (e.g. due to endian-ness). In that case, using a templated class with specialized functions might make more sense. (send<char> sends the byte, send<short> converts it to network byte order first)
Yet another solution would be to create a "BufferElementType" class with helper methods (convertToNetworkOrderBytes()) but I'll bet that would be an overkill for you.