Using std::tuple correctly? - c++

I recently learned how to use std::tuple (I had never needed it before then), and was curious as to when it is the proper time to use it over a container? I only ask because std::tuple doesn't feel like a container, since it requires things such as std::get and std::make_tuple.

std::tuples contain a fixed number of elements of varying types. You use it when you want to group together a few objects of different types. I would use it over a struct when the grouping doesn't have any particularly useful semantics. For example, a std::tuple is useful when a function needs to return multiple objects. A container, on the other hand, contains multiple elements of the same type (perhaps a varying amount).

It's more of a replacement for a struct than for a container. However, in most cases you are better off actually defining a struct for readability reasons. Tuples are best used in templates with variable number of arguments.
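As a rough illustration of the variadic-template case, here is a minimal sketch. The names bundle and element_count are made up for this example (std::make_tuple already does essentially what bundle does); the only point is that a parameter pack of unknown length and types can be stored in one std::tuple.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical helper: stores any number of arguments, of any types,
// in a single object. A struct cannot be declared for an unknown pack.
template <typename... Args>
std::tuple<typename std::decay<Args>::type...> bundle(Args&&... args)
{
    return std::tuple<typename std::decay<Args>::type...>(
        std::forward<Args>(args)...);
}

// Hypothetical helper: the number of stored elements is part of the type.
template <typename... Ts>
constexpr std::size_t element_count(const std::tuple<Ts...>&)
{
    return sizeof...(Ts);
}
```

Calling bundle(1, 2.5, 'c') yields a std::tuple<int, double, char>, and each element is read back with std::get<N>.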

Related

Generic programming with ranges/views not templates?

I'd like to have a function that accepts any container of a fixed type. For example, a function that will accept both std::array<float,1> and std::array<float,2>.
I thought this would be possible with ranges but I'm realizing my understanding is quite superficial.
Is this possible without templates?
Edit: Can we define a type using the ranges library that will do the equivalent of span but will work for non-contiguous containers? Maybe I didn't phrase my question right, I probably meant view rather than container.
For contiguous ranges, you might use std::span (C++20):
void foo(std::span<float>)
{
// ...
}

Method operating on container: hardcode the container type, or use generic template iterators?

I have code where, conceptually, my input is some container of Foo objects. The code "processes" these objects one by one, and the desired result is to fill up a container of FooProduct result objects.
I only need a single pass through the input container. The "processing" is stateful (this isn't an std::transform()) and the number of result objects is independent of the number of input objects.
Offhand, I could see two obvious ways to define the API here.
The easiest way to do this is to hardcode a specific type of container. For example, I could decide I'm expecting vector parameters, e.g.:
void ProcessContainerOfFoos(const std::vector<Foo>& in, std::vector<FooProduct>& out);
But, I don't really have any reason to limit client code to a particular type of container. Instead of constraining the parameter types specifically to vector, I could make the method generic and use iterators as template parameters:
/**
* @tparam Foo_InputIterator_T An input iterator giving objects of type Foo.
* @tparam FooProduct_OutputIterator_T An output iterator writing objects
* of type FooProduct.
*/
template<typename Foo_InputIterator_T, typename FooProduct_OutputIterator_T >
void ProcessContainerOfFoos(Foo_InputIterator_T first, Foo_InputIterator_T last,
FooProduct_OutputIterator_T out);
I'm debating between these two formulations.
Considerations
To me, the first version seems "easier" and the second seems "more correct":
Non-template types make the signature clearer; I don't need to explain in the documentation what types to use and what the constraints on the template parameter are.
Without templates I can hide the implementation in the .cpp file; with templates I'll need to expose the implementation in a header file, forcing client code to include anything I need for the actual processing logic.
The templated version feels like it expresses my intention more clearly, because I'd rather be indifferent to what container type is used.
The templated version is more flexible and testable - for example, in my code I might be using some custom data structure MySuperEfficientVector, but I'd still be able to test MyFooProcessor without any dependency on the custom class.
Beyond subjective choice given these considerations, is there a major reason to choose one of these over the other? Likewise, is there a better way to construct this API which I'm missing?
Besides the considerations that you've already listed:
The template version allows the client code to pass any iterator
range, for example a sub-range or reverse iterators, not just an entire container from begin to end.
The template version allows passing value types other than Foo. For this to be useful, the processing must be generic of course.
If the template works with only a specific value type and the user passes iterators to the wrong type, the error message might not be very descriptive of their mistake. If this is a concern, you can give the user a better error using type traits: static_assert(std::is_same<typename std::iterator_traits<Iter>::value_type, Foo>::value, "I want my Foo"); Before concepts (standardized in C++20), there was no good way to communicate the requirements of a template type in the signature to the user.
There is also the option to provide both functions. The hard coded one can delegate to the templated version. This gives you the advantages of both versions at the expense of bloating your api.
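A minimal sketch of the "provide both" option, assuming placeholder definitions of Foo and FooProduct and a made-up processing rule (one product per two inputs) just to have something stateful. Only the shape of the two overloads matters here.

```cpp
#include <cassert>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for the question's types.
struct Foo { int value; };
struct FooProduct { int total; };

// Generic version: single pass, stateful. The rule used here (emit one
// FooProduct for every two Foos) is an assumption for illustration only.
template <typename InputIt, typename OutputIt>
void ProcessContainerOfFoos(InputIt first, InputIt last, OutputIt out)
{
    static_assert(
        std::is_same<typename std::iterator_traits<InputIt>::value_type,
                     Foo>::value,
        "ProcessContainerOfFoos requires iterators over Foo");

    int pending = 0;            // state carried between elements
    bool have_pending = false;
    for (; first != last; ++first) {
        if (have_pending) {
            *out++ = FooProduct{pending + first->value};
            have_pending = false;
        } else {
            pending = first->value;
            have_pending = true;
        }
    }
}

// Convenience overload for the common case; it only delegates.
inline void ProcessContainerOfFoos(const std::vector<Foo>& in,
                                   std::vector<FooProduct>& out)
{
    ProcessContainerOfFoos(in.begin(), in.end(), std::back_inserter(out));
}
```

Callers with plain vectors use the two-argument overload; callers with sub-ranges, reverse iterators, or other containers use the three-argument template directly.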
It depends. If this function is only going to be used with vectors for the time being, why bother?
I suggest doing templated version only when it becomes necessary. Predicting such things in advance is hard.

STL container requirements

Does the standard require that some_container<T>::value_type be T?
I am asking because I am considering different approaches to implementing an STL-compliant 2d dynamic array. One of them is to have 2Darray<T>::value_type be 2Darray_row<T> or something like that, where the array would be iterated as a collection of rows (a little simplified. My actual implementation allows iteration in 3 directions)
The container requirements are a bit funky in the sense that they are actually not used by any generic algorithm. In that sense, it doesn't really matter much.
That said, the requirements are on the interface for containers not on how the container is actually instantiated. Even non-template classes can conform to the various requirements and, in fact, do. The requirement is that value_type is present; what it is defined to depends entirely on the container implementation.
Table 96 in §23.2.1 in the standard (C++11) requires a container class X containing objects of type T to return T for X::value_type.
So, if your some_container stores objects of type T, then value_type has to be T.
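To make that concrete, here is a small sketch. The static_asserts check standard containers; Array2D is a hypothetical single-wrapper class (not the asker's implementation) that stores T in row-major order and therefore reports T as its value_type.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// For a sequence container the stored element type is T itself.
static_assert(std::is_same<std::vector<float>::value_type, float>::value,
              "vector<T>::value_type is T");

// std::map stores pairs, so its value_type is the pair, not the mapped type.
static_assert(std::is_same<std::map<int, std::string>::value_type,
                           std::pair<const int, std::string>>::value,
              "map<K, V>::value_type is pair<const K, V>");

// Hypothetical 2D array that stores T objects, so value_type is T.
template <typename T>
class Array2D {
public:
    using value_type = T;
    using iterator = typename std::vector<T>::iterator;

    Array2D(std::size_t rows, std::size_t cols)
        : cols_(cols), data_(rows * cols) {}

    // Element access by (row, column) into row-major storage.
    T& operator()(std::size_t r, std::size_t c) { return data_[r * cols_ + c]; }

    // Default iteration visits every element, not rows.
    iterator begin() { return data_.begin(); }
    iterator end() { return data_.end(); }

private:
    std::size_t cols_;
    std::vector<T> data_;
};
```

Row or column traversal could then be offered through separate, explicitly named iterator types rather than by changing value_type.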
Either have a nested container (so colArray<rowArray<T> >) or have a single wrapping (2dArray<T>), but don't try to mix them. The nested approach allows you to use STL all the way down (vector<vector<T> >), but can be confusing and doesn't give you column iterators etc., which you seem to want.
This SO answer addresses using ublas, and another suggests using Boost multi-arrays.
Generally, go for the STL or Boost option if you can. You are unlikely to write something as well by yourself.

What are good use-cases for tuples in C++11?

What are good use-cases for using tuples in C++11? For example, I have a function that defines a local struct as follows:
template<typename T, typename CmpF, typename LessF>
void mwquicksort(T *pT, int nitem, const int M, CmpF cmp, LessF less)
{
struct SI
{
int l, r, w;
SI() {}
SI(int _l, int _r, int _w) : l(_l), r(_r), w(_w) {}
} stack[40];
// etc
I was considering to replace the SI struct with an std::tuple<int,int,int>, which is a far shorter declaration with convenient constructors and operators already predefined, but with the following disadvantages:
Tuple elements are hidden in obscure, implementation-defined structs. Even though Visual Studio interprets and shows their contents nicely, I still can't put conditional breakpoints that depend on the value of tuple elements.
Accessing individual tuple fields (get<0>(some_tuple)) is far more verbose than accessing struct elements (s.l).
Accessing fields by name is far more informative (and shorter!) than by numeric index.
The last two points are somewhat addressed by the tie function. Given these disadvantages, what would be a good use-case for tuples?
UPDATE Turns out that VS2010 SP1 debugger cannot show the contents of the following array std::tuple<int, int, int> stack[40], but it works fine when it's coded with a struct. So the decision is basically a no-brainer: if you'll ever have to inspect its values, use a struct [esp. important with debuggers like GDB].
It is an easy way to return multiple values from a function:
std::tuple<int,int> fun();
The result values can be used elegantly as follows:
int a;
int b;
std::tie(a,b)=fun();
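A complete version of the same idea, as a sketch. The functions div_mod and remainder_only are hypothetical names chosen for this example; std::tie and std::ignore are the standard facilities.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical function returning quotient and remainder together.
// Assumes divisor is non-zero.
std::tuple<int, int> div_mod(int dividend, int divisor)
{
    return std::make_tuple(dividend / divisor, dividend % divisor);
}

// std::tie unpacks into existing variables; std::ignore skips the
// elements the caller does not need.
int remainder_only(int dividend, int divisor)
{
    int r = 0;
    std::tie(std::ignore, r) = div_mod(dividend, divisor);
    return r;
}
```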
Well, imho, the most important part is generic code. Writing generic code that works on all kinds of structs is a lot harder than writing generics that work on tuples. For example, the std::tie function you mentioned yourself would be very nearly impossible to make for structs.
This allows you to do things like the following:
Store function parameters for delayed execution (e.g. this question )
Return multiple parameters without cumbersome (un)packing with std::tie
Combine (not equal-typed) data sets (e.g. from parallel execution), it can be done as simply as std::tuple_cat.
The thing is, it does not stop with these uses, people can expand on this list and write generic functionality based on tuples that is much harder to do with structs. Who knows, maybe tomorrow someone finds a brilliant use for serialization purposes.
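Two of those uses can be sketched in a few lines. This assumes C++14 for std::index_sequence (C++17 provides std::apply for the same job); call_with, call_with_impl and add3 are hypothetical names for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Expands the tuple's elements into a function call.
template <typename F, typename Tuple, std::size_t... I>
decltype(auto) call_with_impl(F&& f, Tuple&& args, std::index_sequence<I...>)
{
    return std::forward<F>(f)(std::get<I>(std::forward<Tuple>(args))...);
}

// Calls f with arguments that were stored earlier in a tuple.
template <typename F, typename Tuple>
decltype(auto) call_with(F&& f, Tuple&& args)
{
    using Indices = std::make_index_sequence<
        std::tuple_size<typename std::decay<Tuple>::type>::value>;
    return call_with_impl(std::forward<F>(f), std::forward<Tuple>(args),
                          Indices{});
}

// Example target for the delayed call.
inline int add3(int a, int b, int c) { return a + b + c; }
```

With auto stored = std::make_tuple(1, 2, 3), the call call_with(add3, stored) can happen at any later point. Combining differently typed results is a one-liner: std::tuple_cat(std::make_tuple(1, 2.5), std::make_tuple('x')) produces a std::tuple<int, double, char>.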
I think most of the use for tuples comes from std::tie:
bool MyStruct::operator<(MyStruct const &o) const
{
return std::tie(a, b, c) < std::tie(o.a, o.b, o.c);
}
Along with many other examples in the answers here. I find this example to be the most commonly useful, however, as it saves a lot of effort from how it used to be in C++03.
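For reference, a self-contained sketch of that pattern. Date is a hypothetical type chosen for illustration; the comparison is lexicographic over the members in the order they are passed to std::tie.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical type used only for illustration.
struct Date {
    int year;
    int month;
    int day;
};

// std::tie builds tuples of references, so nothing is copied, and the
// tuple operator< compares year first, then month, then day.
inline bool operator<(const Date& lhs, const Date& rhs)
{
    return std::tie(lhs.year, lhs.month, lhs.day) <
           std::tie(rhs.year, rhs.month, rhs.day);
}

// The same trick gives member-wise equality.
inline bool operator==(const Date& lhs, const Date& rhs)
{
    return std::tie(lhs.year, lhs.month, lhs.day) ==
           std::tie(rhs.year, rhs.month, rhs.day);
}
```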
I think there is NO good use for tuples outside of implementation details of some generic library feature.
The (possible) savings in typing do not offset the losses in self-documenting properties of the resulting code.
Substituting tuples for structs just takes away a meaningful name for a field, replacing the field name with a "number" (just like the ill-conceived concept of an std::pair).
Returning multiple values using tuples is much less self-documenting than the alternatives -- returning named types or using named references. Without this self-documentation, it is easy to confuse the order of the returned values if they are mutually convertible.
Have you ever used std::pair? Many of the places you'd use std::tuple are similar, but not restricted to exactly two values.
The disadvantages you list for tuples also apply to std::pair, sometimes you want a more expressive type with better names for its members than first and second, but sometimes you don't need that. The same applies to tuples.
The real use cases are situations where you have unnameable elements: variadic templates and lambda functions. In both situations you can have unnamed elements with unknown types, and thus the only way to store them is a struct with unnamed elements: std::tuple. In every other situation you have a known number of nameable elements with known types and can thus use an ordinary struct, which is the superior answer 99% of the time.
For example, you should NOT use std::tuple to have "multiple returns" from ordinary functions or templates w/ a fixed number of generic inputs. Use a real structure for that. A real object is FAR more "generic" than the std::tuple cookie-cutter, because you can give a real object literally any interface. It will also give you much more type safety and flexibility in public libraries.
Just compare these 2 class member functions:
std::tuple<double, double, double> GetLocation() const; // x, y, z
GeoCoordinate GetLocation() const;
With a real 'geo coordinate' object I can provide an operator bool() that returns false if the parent object had no location. Via its APIs users could get the x, y, z locations. But here's the big thing: if I decide to make GeoCoordinate 4D by adding a time field in 6 months, current users' code won't break. I cannot do that with the std::tuple version.
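A minimal sketch of what such a type might look like. The member names and the choice of an explicit operator bool are assumptions for illustration, not a prescribed design.

```cpp
#include <cassert>

// Hypothetical GeoCoordinate: a named type can carry a "no location"
// state and named accessors, which a bare tuple of doubles cannot.
class GeoCoordinate {
public:
    // Default-constructed means "no location available".
    GeoCoordinate() : valid_(false), x_(0.0), y_(0.0), z_(0.0) {}

    GeoCoordinate(double x, double y, double z)
        : valid_(true), x_(x), y_(y), z_(z) {}

    // False when the parent object had no location.
    explicit operator bool() const { return valid_; }

    double x() const { return x_; }
    double y() const { return y_; }
    double z() const { return z_; }

private:
    bool valid_;
    double x_;
    double y_;
    double z_;
};
```

Adding a time() accessor later would leave existing callers of x(), y() and z() untouched, whereas changing std::tuple<double, double, double> to a four-element tuple changes the type every caller sees.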
Interoperation with other programming languages that use tuples, and returning multiple values without having the caller have to understand any extra types. Those are the first two that come to my mind.
I cannot comment on mirk's answer, so I'll have to give a separate answer:
I think tuples were added to the standard also to allow for functional style programming. As an example, while code like
void my_func(const MyClass& input, MyClass& output1, MyClass& output2, MyClass& output3)
{
// whatever
}
is ubiquitous in traditional C++, because it is the only way to have multiple objects returned by a function, this is an abomination for functional programming. Now you may write
tuple<MyClass, MyClass, MyClass> my_func(const MyClass& input)
{
// whatever
return tuple<MyClass, MyClass, MyClass>(output1, output2, output3);
}
Thus having the chance to avoid side effects and mutability, to allow for pipelining, and, at the same time, to preserve the semantic strength of your function.
F.21: To return multiple "out" values, prefer returning a struct or tuple.
Prefer using a named struct where there are semantics to the returned value. Otherwise, a nameless tuple is useful in generic code.
For instance, if the returned values are a value read from the input stream and an error code, these values will not go far together. They are not related enough to justify a dedicated structure to hold both. By contrast, an x and y pair would be better served by a structure like Point.
The source I reference is maintained by Bjarne Stroustrup and Herb Sutter, so I think it is reasonably trustworthy.

What is std::pair?

What is std::pair for, why would I use it, and what benefits does boost::compressed_pair bring?
compressed_pair uses some template trickery to save space. In C++, an object (small o) cannot have the same address as a different object.
So even if you have
struct A { };
A's size will not be 0, because then:
A a1;
A a2;
&a1 == &a2;
would hold, which is not allowed.
But many compilers will do what is called the "empty base class optimization":
struct A { };
struct B { int x; };
struct C : public A { int x; };
Here, it is fine for B and C to have the same size, even if sizeof(A) can't be zero.
So boost::compressed_pair takes advantage of this optimization and will, where possible, inherit from one or the other of the types in the pair if it is empty.
So a std::pair might look like (I've elided a good deal, ctors etc.):
template<typename FirstType, typename SecondType>
struct pair {
FirstType first;
SecondType second;
};
That means if either FirstType or SecondType is A, your pair<A, int> has to be bigger than sizeof(int).
But if you use compressed_pair, its generated code will look akin to:
struct compressed_pair<A,int> : private A {
int second_;
A first() { return *this; }
int second() { return second_; }
};
And compressed_pair<A,int> will only be as big as sizeof(int).
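The size difference can be checked directly. This is a sketch with hypothetical types (Empty, PlainPair, CompressedPair), not Boost's actual implementation; the last size relation holds on mainstream compilers that apply the empty base class optimization.

```cpp
#include <cassert>

struct Empty {};

// Member layout: the empty member still needs its own address,
// so the struct ends up larger than the int alone.
struct PlainPair {
    Empty first;
    int second;
};

// Base-class layout: the empty base can share storage with second_.
struct CompressedPair : private Empty {
    explicit CompressedPair(int s) : second_(s) {}
    Empty first() const { return *this; }
    int second() const { return second_; }

private:
    int second_;
};

static_assert(sizeof(Empty) >= 1, "distinct objects need distinct addresses");
static_assert(sizeof(PlainPair) > sizeof(int), "empty member adds size");
```

On GCC, Clang and MSVC, sizeof(CompressedPair) comes out equal to sizeof(int).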
std::pair is a data type for grouping two values together as a single object. std::map uses it for key, value pairs.
While you're learning pair, you might check out tuple. It's like pair but for grouping an arbitrary number of values. tuple is part of TR1 and many compilers already include it with their Standard Library implementations.
Also, check out Chapter 1, "Tuples," of the book The C++ Standard Library Extensions: A Tutorial and Reference by Pete Becker, ISBN-13: 9780321412997, for a thorough explanation.
You sometimes need to return 2 values from a function, and it's often overkill to go and create a class just for that.
std::pair comes in handy in those cases.
I think boost::compressed_pair is able to optimize away the members of size 0.
Which is mostly useful for heavy template machinery in libraries.
If you do control the types directly, it's irrelevant.
It can sound strange to hear that compressed_pair cares about a couple of bytes. But it can actually be important when one considers where compressed_pair can be used. For example let's consider this code:
boost::function<void(int)> f(boost::bind(&f, _1));
It can suddenly have a big impact to use compressed_pair in cases like above. What could happen if boost::bind stores the function pointer and the place-holder _1 as members in itself or in a std::pair in itself? Well, it could bloat up to sizeof(&f) + sizeof(_1). Assuming a function pointer has 8 bytes (not uncommon especially for member functions) and the placeholder has one byte (see Logan's answer for why), then we could have needed 9 bytes for the bind object. Because of aligning, this could bloat up to 12 bytes on a usual 32bit system.
boost::function encourages its implementations to apply a small object optimization. That means that for small functors, a small buffer directly embedded in the boost::function object is used to store the functor. For larger functors, the heap would have to be used by using operator new to get memory. Around boost version 1.34, it was decided to adopt this optimization, because it was figured one could gain some very great performance benefits.
Now, a reasonable (yet, maybe still quite small) limit for such a small buffer would be 8 bytes. That is, our quite simple bind object would not fit into the small buffer, and would require operator new to be stored. If the bind object above would use a compressed_pair, it can actually reduce its size to 8 bytes (or 4 bytes for non-member function pointer often), because the placeholder is nothing more than an empty object.
So, what may look like just wasting a lot of thought for just only a few bytes actually can have a significant impact on performance.
It's a standard class for storing a pair of values. It's returned/used by some standard functions, like std::map::insert.
boost::compressed_pair claims to be more efficient.
std::pair comes in handy for a couple of the other container classes in the STL.
For example:
std::map<>
std::multimap<>
Both store std::pairs of keys and values.
When using the map and multimap, you often access the elements through an iterator that refers to a pair.
Additional info: boost::compressed_pair is useful when one of the pair's types is an empty struct. This is often used in template metaprogramming when the pair's types are programmatically inferred from other types. At the end, you usually have some form of "empty struct".
I would prefer std::pair for any "normal" use, unless you are into heavy template metaprogramming.
It's nothing but a structure with two variables under the hood.
I actually dislike using std::pair for function returns. The reader of the code would have to know what .first is and what .second is.
The compromise I use sometimes is to immediately create constant references to .first and .second, while naming the references clearly.
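A small sketch of that compromise, using std::minmax_element (which returns a pair of iterators) as the pair-returning function. The function value_range is a hypothetical name for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Returns the spread between the largest and smallest value.
// Precondition: values is not empty.
int value_range(const std::vector<int>& values)
{
    assert(!values.empty());

    const auto extremes = std::minmax_element(values.begin(), values.end());

    // Name the halves of the pair once, then never touch
    // .first/.second again.
    const auto& smallest = extremes.first;
    const auto& largest = extremes.second;

    return *largest - *smallest;
}
```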
What is std::pair for, why would I use it?
It is just a simple two-element tuple. It was defined in the first version of the STL, at a time when compilers did not widely support the templates and metaprogramming techniques required to implement a more sophisticated tuple type like Boost.Tuple.
It is useful in many situations. std::pair is used in standard associative containers. It can be used as a simple form of range std::pair<iterator, iterator> - so one may define algorithms accepting single object representing range instead of two iterators separately.
(It is a useful alternative in many situations.)
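As an illustration of the pair-as-range idea, here is a sketch built on std::multimap::equal_range, which returns exactly such a pair of iterators. The aliases Index and IndexRange and the function sum_values are names invented for this example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

using Index = std::multimap<std::string, int>;
using IndexRange = std::pair<Index::const_iterator, Index::const_iterator>;

// Accepts one object representing the range instead of two iterators.
int sum_values(const IndexRange& range)
{
    int total = 0;
    for (Index::const_iterator it = range.first; it != range.second; ++it) {
        total += it->second;
    }
    return total;
}
```

A caller can pass the result of equal_range straight through, for example sum_values(index.equal_range("a")) on a const Index.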
Sometimes there are two pieces of information that you just always pass around together, whether as a parameter, or a return value, or whatever. Sure, you could write your own object, but if it's just two small primitives or similar, sometimes a pair seems just fine.