How do I declare an arbitrary depth of pair of pairs? - c++

A pair of pair of ints can be declared as: std::pair<int, std::pair<int, int> > A;
Similarly a pair of pair of pair of ints as std::pair<int, std::pair<int, std::pair<int, int> > >A;
I want to declare an arbitrary "pair of pairs" in my code, i.e., depending on some value (known only at runtime), I want to have either a pair of pair of ints (n = 1) or a pair of pair of pair of ints (n = 2) and so on. How do I do this efficiently in C++?
Below is a snippet code in Python:
import numpy as np
n = 4 # a value known at runtime
m = 2 # a value known at runtime
def PP(A, j):
    A_s = []
    if j == n-1:
        for i in range(1, m):
            A_s.append((i, A[i]))
    else:
        for i in range(1, m):
            A_c = A[i]
            A_s.append((i, PP(A_c, j+1)))
    return A_s
j = 0
# The dimension of A is known at runtime.
# Will have to create np.ones((m, m, m, m, m)) if n = 5
A = np.ones((m, m, m, m))
B = PP(A, 0)

Unlike Python, C++ is a statically-typed language. So if the structure or size of what you want to store isn't known until run time, you can't use the type itself, like nested pairs, to describe the specific structure. Instead what you do is use C++ types that can resize dynamically. Namely, std::vector<int> is an idiomatic and efficient way to store a dynamic number of ints in C++.
If you really want a tree-like structure as in your Python example ([(1, [(1, [(1, [(1, 1.0)])])])]), this is possible in C++, too. But it's a bit more work. See for instance Binary Trees in C++: Part 1.

C++ (see this website and n3337, the C++11 draft standard) has both std::variant and std::optional (both added in C++17). It seems that you want some tagged union (like the abstract syntax trees inside C++ compilers) combined with smart pointers (e.g. std::unique_ptr).
Maybe you want something like
class Foo; // forward declaration
class Foo {
    std::variant<std::unique_ptr<Foo>, std::pair<int,int>> fields;
    /// many other things, in particular for the C++ rule of five
};
In contrast to Python, each C++ value has a known type.
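A minimal sketch of that idea, assuming C++17. The names Nested, make_nested and count_ints are hypothetical and only illustrate how a node can hold either a final int or a pointer to a deeper level, so that the nesting depth is decided at run time:

```cpp
#include <cstddef>
#include <memory>
#include <variant>
#include <vector>

// Hypothetical node type: an int paired with either a final int
// or a pointer to the next, deeper level.
struct Nested {
    int first = 0;
    std::variant<int, std::unique_ptr<Nested>> second;
};

// Builds the nested structure from a flat list of at least two ints.
// v.at() throws std::out_of_range if the list is too short.
std::unique_ptr<Nested> make_nested(const std::vector<int>& v, std::size_t pos = 0) {
    auto node = std::make_unique<Nested>();
    node->first = v.at(pos);
    if (pos + 2 == v.size()) {
        node->second = v.at(pos + 1);            // innermost level: plain int
    } else {
        node->second = make_nested(v, pos + 1);  // recurse one level deeper
    }
    return node;
}

// Counts the ints stored in the whole structure.
std::size_t count_ints(const Nested& n) {
    if (std::holds_alternative<int>(n.second)) return 2;
    return 1 + count_ints(*std::get<std::unique_ptr<Nested>>(n.second));
}
```

Calling make_nested({1, 2, 3, 4}) would give the run-time counterpart of std::pair<int, std::pair<int, std::pair<int, int>>>, at the cost of one heap allocation per level.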
You could combine both cleverly, and you might be interested by other standard containers (in particular std::vector). Please read a good C++ programming book and be aware of the rule of five.
Take inspiration from existing open source software on github or gitlab (such as Qt, RefPerSys, Wt, Fish, Clang static analyzer, GCC, and many others)
Read also the documentation of your C++ compiler (e.g. GCC) and of your debugger (e.g. GDB). If you use a recent GCC, compile with all warnings and debug info, e.g. g++ -Wall -Wextra -g. Document by writing your coding conventions, and for a large enough codebase, consider writing your GCC plugin to enforce them. A recent GCC (so GCC 10 in October 2020) has interesting static analysis options, that you could try.
Consider using the Clang static analyzer (see also this draft report, and the DECODER and CHARIOT European projects, and MILEPOST GCC) on your C++ code base.
Read also papers about C++ in recent ACM SIGPLAN conferences.
You could also define your class Matrix, or use existing libraries providing them (e.g. boost), perhaps adapting this answer to C++.
Remember that the sizeof every C++ type is known at compile time. AFAIK, C++ doesn't have flexible array members (like C does).

As others have already said, you cannot do it. Not at runtime, at least.
However, it also doesn't seem useful in general. Let's say you have the following type (like yours, but with a level of nesting known at compile time) and you declare an object of that type
using your_type = std::pair<int, std::pair<int, std::pair<int, int>>>;
your_type obj1{1,{2,{3,4}}};
This is no different from
std::array<int,4> obj2{1,2,3,4};
where you have this correspondence:
obj1.first == obj2[0]
obj1.second.first == obj2[1]
obj1.second.second.first == obj2[2]
obj1.second.second.second == obj2[3]
And to stress that they are actually the same thing, think about how you would implement a size function for such a type; it could be a metafunction like this (probably there are better ways to do it):
template<typename T, typename Ts>
constexpr int size(std::pair<T,Ts>) {
    if constexpr (std::is_same_v<T,Ts>)
        return 2;
    else
        return 1 + size(Ts{});
}
In other words they are isomorphic.
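To make the correspondence concrete, here is a rough sketch that copies a right-nested pair of ints into a std::array of the matching length. The helper names nested_size, flatten_into and flatten are made up for this illustration and are not part of the standard library:

```cpp
#include <array>
#include <cstddef>
#include <type_traits>
#include <utility>

// Number of ints in a right-nested pair type; a plain int counts as 1.
template <typename T>
struct nested_size : std::integral_constant<std::size_t, 1> {};

template <typename A, typename B>
struct nested_size<std::pair<A, B>>
    : std::integral_constant<std::size_t, 1 + nested_size<B>::value> {};

// Innermost element: store the final int.
template <std::size_t N>
void flatten_into(int value, std::array<int, N>& out, std::size_t pos) {
    out[pos] = value;
}

// Store .first, then continue with .second.
template <typename B, std::size_t N>
void flatten_into(const std::pair<int, B>& p, std::array<int, N>& out, std::size_t pos) {
    out[pos] = p.first;
    flatten_into(p.second, out, pos + 1);
}

// Converts the nested pair into the equivalent flat array.
template <typename B>
std::array<int, nested_size<std::pair<int, B>>::value>
flatten(const std::pair<int, B>& p) {
    std::array<int, nested_size<std::pair<int, B>>::value> out{};
    flatten_into(p, out, 0);
    return out;
}
```

With obj1 from above, flatten(obj1) should hold the same four values as obj2, in the same order.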
Now your requirement is to write your_type in such a way that the level of nesting is known at run time. As already said, that's not possible, but guess what is isomorphic to the type you imagine? std::vector<int>!
So my answer is: you don't need that; just use std::vector instead.
As a final comment, based on my answer it seems reasonable that, if you really want a type which behaves like an arbitrarily nested pair, you can just write a wrapper around std::vector. But I don't see why you might really need to do it.
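For what it's worth, such a wrapper could look roughly like this. NestedInts, depth and nth are hypothetical names, and the depth convention (n = 1 for a pair of pair of ints) follows the question:

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical wrapper: a "pair of pairs" of run-time depth, stored flat.
class NestedInts {
public:
    explicit NestedInts(std::vector<int> values) : values_(std::move(values)) {
        if (values_.size() < 2) {
            throw std::invalid_argument("a pair needs at least two ints");
        }
    }

    // Depth in the question's sense: n = 1 is pair<int, pair<int, int>>.
    std::size_t depth() const { return values_.size() - 2; }

    // nth(k) corresponds to applying .second k times and then .first,
    // except that the last index is the innermost .second.
    int nth(std::size_t k) const { return values_.at(k); }

private:
    std::vector<int> values_;
};
```

NestedInts(std::vector<int>{1, 2, 3, 4}) then plays the role of your_type obj1{1,{2,{3,4}}}, with the depth chosen at run time.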

Related

In C++ get smallest integer type that can hold given amount of bits

If I have compile time constant num_bits how can I get smallest integer type that can hold this amount of bits?
Of course I can do:
#include <cstddef>
#include <cstdint>
#include <type_traits>
std::size_t constexpr num_bits = 19;
using T =
    std::conditional_t<num_bits <= 8, uint8_t,
    std::conditional_t<num_bits <= 16, uint16_t,
    std::conditional_t<num_bits <= 32, uint32_t,
    std::conditional_t<num_bits <= 64, uint64_t,
    void>>>>;
But maybe there exists some ready made meta function in standard library for achieving this goal?
I created this question only to find out single meta function specifically from standard library. But if you have other nice suggestions how to solve this task besides my proposed above solution, then please post such solutions too...
Update. As suggested in comments, uint_leastX_t should be used instead of uintX_t everywhere in my code above, as uintX_t may not exist on some platforms, while uint_leastX_t always exist.
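With that update applied, the code above could be wrapped in an alias template like the following sketch. The name least_uint_t is my own and not something from the standard library:

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Smallest uint_leastN_t that can hold Bits bits, or void if Bits > 64.
template <std::size_t Bits>
using least_uint_t =
    std::conditional_t<(Bits <= 8),  std::uint_least8_t,
    std::conditional_t<(Bits <= 16), std::uint_least16_t,
    std::conditional_t<(Bits <= 32), std::uint_least32_t,
    std::conditional_t<(Bits <= 64), std::uint_least64_t,
    void>>>>;

static_assert(std::is_same<least_uint_t<19>, std::uint_least32_t>::value,
              "19 bits need the 32-bit-or-wider type");
```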
There is currently no template type for this in the C++ standard library.
The standard library does have something to do the reverse of what you're looking for (std::numeric_limits<T>::max).
There is a proposal to add such functionality to the C++ standard library in p0102: "C++ Parametric Number Type Aliases". A similar question to yours has been asked on lists.isocpp.org/std-proposals, where p0102 was mentioned.
In case you're interested in / okay with using Boost, Boost has boost::int_max_value_t<V>, which is a struct defining type members least (smallest) and fast ("easiest to manipulate").
No, there is no ready made meta function for this

How to write functions like these intrinsic ones?

I just started to learn Fortran and I found that some intrinsic functions look mysterious to me.
One of them is ALLOCATE: ALLOCATE(array(-5:4, 1:10)). If I wanted to write a similar function, what would it look like? What is its argument, and what type would it have? It's not obvious to me what array(-5:4, 1:10) is. array is still not allocated, so what does this expression mean? What is its type?! What would be the difference between array(10) and array(-5:4, 1:10) as types? Is it some hidden preallocated "meta-object" with some internal attribute like "dimension"? At least it does not look like an array pointer in C.
The next mysterious example is PACK: pack(m, m /= 0). At first, I thought that m /= 0 was like a function pointer, i.e. a lambda, as in Python's pack(m, lambda el: el != 0) or Haskell's pack m (\el -> el /= 0). But then I read somewhere on the web that it's not a lambda but a list of booleans, one per item of m. But this means that it's very inefficient code: it eats a lot of memory if m is big! So I cannot understand how these intrinsic functions work. Moreover, I have a feeling that a user cannot write such functions, and that they are coded in C and not in Fortran itself. Is that true? How were they written?!
ALLOCATE is not a library function, as pointed out by @dave_thompson_085. Not only that, there are also some actual intrinsic functions that cannot be written by the user, like min(), max() and transfer(). They are not just "library" functions; they are "intrinsic", part of the core language, and hence can do things that user code cannot. They are written in whatever language was used to write the compiler. That is mostly C, but they could probably also be implemented in Fortran, just not as a normal Fortran function but as a feature the compiler inserts.
For functions that accept a mask, like PACK (and many others that accept one as an optional argument), the mask is a logical array. The compiler is free to implement any optimizations to avoid allocating such an array, but these optimizations cannot be guaranteed. PACK is not just a library function that is called in a straightforward way; the compiler can insert any code that does what the function is supposed to do.

vector<double>::size_type versus alternatives

My background is mostly in R, SAS, and VBA, and I'm trying to learn some C++. I've picked "Accelerated C++" (Koenig, Moo) as one of my first books on the subject. My theoretical background in comp. sci. is admittedly not the most robust, which perhaps explains why I'm confused by points like these.
I have a question about a piece of code similar to the following:
#include <iostream>
#include <vector>
int main() {
    double input;
    std::vector<double> example_vector;
    while (std::cin >> input) {
        example_vector.push_back(input);
    }
    std::vector<double>::size_type vector_size;
    vector_size = example_vector.size();
    return 0;
}
As I understand it, vector_size is "large enough" to hold the size of example_vector, no matter how large example_vector might be. I'm not sure I understand what this means: is vector_size (in this case) capable of representing an integer larger than, say, long long x;, so that std::cout << vector_size; would print a value that's different from std::cout << x;? How/why?
What this question boils down to is that the standard does not mandate what actual type is returned by the vector<T>::size() method. Different implementations may make different choices.
So, if you wish to assign the value returned by a call to size() to a variable, what type should you use for that variable? In order to write code that is portable across different implementations, you need a way to name that type, recognising the fact that different implementations of the standard library may use different types.
The answer is that vector<T> provides the type that you should use. It is
vector<T>::size_type
One thing that you do need to understand, and get used to, with C++, is that the standard does need to cater for significant variation between different implementations.
std::vector<T>::size_type is exactly what the standard implies, the type currently used by your implementation to allow storing the size.
Historically, architectures have changed, requiring code parsers, retesting and other basic wastes of time. By following the standard, your code will work on any compliant platform, now and in the future.
So your code can work on 16-bit architectures and, theoretically, on 256-bit architectures at such time as they are supported.
So, while you are most likely only going to see it as size_t if you change platforms, you don't have to worry. And more importantly, whoever is maintaining your code doesn't have to either.
In the vector class (and many others) they have a typedef:
typedef implementation_defined size_type;
A "size_type" may or may not be size_t. That said, size_t is large enough to cover all your memory (32 or 64 bits, as the case may be). It is likely that size_type will also always be large enough to cover all your memory (at least as long as you program on your desktop computer), but that is not guaranteed.
Additionally, to know the size of a variable, and thus how much it can hold, you use sizeof.
So this would give you the actual size of the size_type type:
std::cout << sizeof(std::vector<double>::size_type) << std::endl;
which is typically going to be 4 or 8, depending on your processor and compiler. The largest value the type can hold is 2 to the power of (that sizeof x 8), minus 1 (i.e. 2^32 - 1 or 2^64 - 1).
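Instead of computing the limit by hand from sizeof, you can ask std::numeric_limits directly. The sketch below uses two helper names of my own (largest_size_value and exceeds_long_long) to address the comparison with long long from the question:

```cpp
#include <limits>
#include <vector>

using size_type = std::vector<double>::size_type;

// Largest value size_type can represent on this implementation.
constexpr size_type largest_size_value() {
    return std::numeric_limits<size_type>::max();
}

// True if size_type can represent values beyond the range of long long.
constexpr bool exceeds_long_long() {
    return largest_size_value() >
           static_cast<unsigned long long>(std::numeric_limits<long long>::max());
}
```

Since size_type is an unsigned integer type, on a platform where it is as wide as long long it can hold values that long long cannot. That does not mean a vector of that size could ever be allocated.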

Template Metaprogramming - I still don't get it :(

I have a problem... I don't understand template metaprogramming.
The problem is, that I’ve read a lot about it, but it still doesn’t make much sense to me.
Fact nr.1: Template Metaprogramming is faster
template <int N>
struct Factorial
{
    enum { value = N * Factorial<N - 1>::value };
};
template <>
struct Factorial<0>
{
    enum { value = 1 };
};
// Factorial<4>::value == 24
// Factorial<0>::value == 1
void foo()
{
    int x = Factorial<4>::value; // == 24
    int y = Factorial<0>::value; // == 1
}
So this metaprogram is faster ... because of the constant literal.
BUT: Where in the real world do we have constant literals? Most programs I use react on user input.
FACT nr. 2 : Template metaprogramming can accomplish better maintainability.
Yeah, the factorial example may be maintainable, but when it comes to complex functions, I and most other C++ programmers can't read them.
Also, the debugging options are very poor (or at least I don't know how to debug).
When does template metaprogramming make sense?
Just as factorial is not a realistic example of recursion in non-functional languages, neither is it a realistic example of template metaprogramming. It's just the standard example people reach for when they want to show you recursion.
In writing templates for realistic purposes, such as in everyday libraries, often the template has to adapt what it does depending on the type parameters it is instantiated with. This can get quite complex, as the template effectively chooses what code to generate, conditionally. This is what template metaprogramming is; if the template has to loop (via recursion) and choose between alternatives, it is effectively like a small program that executes during compilation to generate the right code.
Here's a really nice tutorial from the boost documentation pages (actually excerpted from a brilliant book, well worth reading).
http://www.boost.org/doc/libs/1_39_0/libs/mpl/doc/tutorial/representing-dimensions.html
I use template metaprogramming for SSE swizzling operators, to optimize shuffles at compile time.
SSE swizzles ('shuffles') can only be masked with a byte literal (immediate value), so we created a 'mask merger' template class that merges masks at compile time when multiple shuffles occur:
template <unsigned target, unsigned mask>
struct _mask_merger
{
    enum
    {
        ROW0 = ((target >> (((mask >> 0) & 3) << 1)) & 3) << 0,
        ROW1 = ((target >> (((mask >> 2) & 3) << 1)) & 3) << 2,
        ROW2 = ((target >> (((mask >> 4) & 3) << 1)) & 3) << 4,
        ROW3 = ((target >> (((mask >> 6) & 3) << 1)) & 3) << 6,
        MASK = ROW0 | ROW1 | ROW2 | ROW3,
    };
};
This works and produces remarkable code without generated code overhead and little extra compile time.
So this metaprogram is faster ... because of the constant literal.
BUT: Where in the real world do we have constant literals?
Most programs I use react on user input.
That's why it's hardly ever used for values. Usually, it is used on types: using types to compute and generate new types.
There are many real-world uses, some of which you're already familiar with even if you don't realize it.
One of my favorite examples is that of iterators. They're mostly designed just with generic programming, yes, but template metaprogramming is useful in one place in particular:
To patch up pointers so they can be used as iterators. An iterator must expose a handful of typedef's, such as value_type. Pointers don't do that.
So code such as the following (basically identical to what you find in Boost.Iterator)
template <typename T>
struct value_type {
    typedef typename T::value_type type;
};
template <typename T>
struct value_type<T*> {
    typedef T type;
};
is a very simple template metaprogram, but which is very useful. It lets you get the value type of any iterator type T, whether it is a pointer or a class, simply by value_type<T>::type.
And I think the above has some very clear benefits when it comes to maintainability. Your algorithm operating on iterators only has to be implemented once. Without this trick, you'd have to make one implementation for pointers, and another for "proper" class-based iterators.
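As a rough illustration of "implemented once", here is a sketch of an algorithm that works for both pointers and class-based iterators through that trait. The extra const T* specialization, sum_range and total_of are my own additions for this example and are not taken from Boost.Iterator:

```cpp
#include <cstddef>
#include <vector>

// The trait from above, plus a hypothetical extra specialization so that
// pointers to const also yield a non-const value type.
template <typename T> struct value_type { typedef typename T::value_type type; };
template <typename T> struct value_type<T*> { typedef T type; };
template <typename T> struct value_type<const T*> { typedef T type; };

// One implementation for every iterator kind: the trait supplies the
// type of the running total.
template <typename Iter>
typename value_type<Iter>::type sum_range(Iter first, Iter last) {
    typename value_type<Iter>::type total = typename value_type<Iter>::type();
    for (; first != last; ++first) total += *first;
    return total;
}

// Class-based iterators (std::vector<double>::const_iterator).
double total_of(const std::vector<double>& v) {
    return sum_range(v.begin(), v.end());
}

// Raw pointers used as iterators.
int total_of(const int* data, std::size_t n) {
    return sum_range(data, data + n);
}
```

In real code you would use std::iterator_traits, which does the same job in the standard library.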
Tricks like boost::enable_if can be very valuable too. You have an overload of a function which should be enabled for a specific set of types only. Rather than defining an overload for each type, you can use metaprogramming to specify the condition and pass it to enable_if.
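A small sketch of that technique, using std::enable_if (the standard counterpart of boost::enable_if since C++11). The function is_even is just a made-up example:

```cpp
#include <cmath>
#include <type_traits>

// Enabled only for integer types.
template <typename T>
typename std::enable_if<std::is_integral<T>::value, bool>::type
is_even(T x) {
    return x % 2 == 0;
}

// Enabled only for floating-point types.
template <typename T>
typename std::enable_if<std::is_floating_point<T>::value, bool>::type
is_even(T x) {
    return std::fmod(x, static_cast<T>(2)) == static_cast<T>(0);
}
```

Each overload covers a whole family of types with a single definition, and a call with any other type (say, a std::string) fails to compile because neither overload is enabled.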
Earwicker already mentioned another good example, a framework for expressing physical units and dimensions. It allows you to express computations with physical units attached, and enforces the result type. Multiplying meters by meters yields a number of square meters. Template metaprogramming can be used to automatically produce the right type.
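A heavily simplified sketch of the units idea, tracking only length and time exponents. Quantity and the aliases are hypothetical names of mine, and this is not how the Boost example is actually written:

```cpp
#include <type_traits>

// A value tagged with the exponents of its length and time dimensions.
template <int LengthExp, int TimeExp>
struct Quantity {
    double value;
};

// Addition only compiles when both operands have the same dimensions.
template <int L, int T>
constexpr Quantity<L, T> operator+(Quantity<L, T> a, Quantity<L, T> b) {
    return {a.value + b.value};
}

// Multiplication adds the exponents, division subtracts them.
template <int L1, int T1, int L2, int T2>
constexpr Quantity<L1 + L2, T1 + T2> operator*(Quantity<L1, T1> a, Quantity<L2, T2> b) {
    return {a.value * b.value};
}

template <int L1, int T1, int L2, int T2>
constexpr Quantity<L1 - L2, T1 - T2> operator/(Quantity<L1, T1> a, Quantity<L2, T2> b) {
    return {a.value / b.value};
}

using Meters          = Quantity<1, 0>;
using Seconds         = Quantity<0, 1>;
using SquareMeters    = Quantity<2, 0>;
using MetersPerSecond = Quantity<1, -1>;
```

With these definitions, SquareMeters area = Meters{3.0} * Meters{4.0}; compiles, while assigning the same product to a Meters variable is a compile error.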
But most of the time, template metaprogramming is used (and useful) in small, isolated cases, basically to smooth out bumps and exceptional cases, to make a set of types look and behave uniformly, allowing you to use generic programming more efficiently.
Seconding the recommendation for Alexandrescu's Modern C++ Design.
Templates really shine when you're writing a library that has pieces which can be assembled combinatorically in a "choose a Foo, a Bar and a Baz" approach, and you expect users to make use of these pieces in some form that is fixed at compile time. For example, I coauthored a data mining library that uses template metaprogramming to let the programmer decide what DecisionType to use (classification, ranking or regression), what InputType to expect (floats, ints, enumerated values, whatever), and what KernelMethod to use (it's a data mining thing). We then implemented several different classes for each category, such that there were several dozen possible combinations.
Implementing 60 separate classes to do this would have involved a lot of annoying, hard-to-maintain code duplication. Template metaprogramming meant that we could implement each concept as a code unit, and give the programmer a simple interface for instantiating combinations of these concepts at compile-time.
Dimensional analysis is also an excellent example, but other people have covered that.
I also once wrote some simple compile-time pseudo-random number generators just to mess with people's heads, but that doesn't really count IMO.
The factorial example is about as useful for real-world TMP as "Hello, world!" is for common programming: It's there to show you a few useful techniques (recursion instead of iteration, "else-if-then" etc.) in a very simple, relatively easy to understand example that doesn't have much relevance for your every-day coding. (When was the last time you needed to write a program that emitted "Hello, world"?)
TMP is about executing algorithms at compile-time and this implies a few obvious advantages:
Since these algorithms failing means your code doesn't compile, failing algorithms never make it to your customer and thus can't fail at the customer's. For me, during the last decade this was the single-most important advantage that led me to introduce TMP into the code of the companies I worked for.
Since the result of executing template-meta programs is ordinary code that's then compiled by the compiler, all advantages of code generating algorithms (reduced redundancy etc.) apply.
Of course, since they are executed at compile-time, these algorithms won't need any run-time and will thus run faster. TMP is mostly about compile-time computing with a few, mostly small, inlined functions sprinkled in between, so compilers have ample opportunities to optimize the resulting code.
Of course, there's disadvantages, too:
The error messages can be horrible.
There's no debugging.
The code is often hard to read.
As always, you'll just have to weigh the advantages against the disadvantages in every case.
As for a more useful example: Once you have grasped type lists and basic compile-time algorithms operating on them, you might understand the following:
typedef
type_list_generator< signed char
, signed short
, signed int
, signed long
>::result_type
signed_int_type_list;
typedef
type_list_find_if< signed_int_type_list
, exact_size_predicate<8>
>::result_type
int8_t;
typedef
type_list_find_if< signed_int_type_list
, exact_size_predicate<16>
>::result_type
int16_t;
typedef
type_list_find_if< signed_int_type_list
, exact_size_predicate<32>
>::result_type
int32_t;
This is (slightly simplified) actual code I wrote a few weeks ago. It will pick the appropriate types from a type list, replacing the #ifdef orgies common in portable code. It doesn't need maintenance, works without adaption on every platform your code might need to get ported to, and emits a compile error if the current platform doesn't have the right type.
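The helpers used above (type_list_generator, type_list_find_if, exact_size_predicate) are not shown, so here is a rough C++11 sketch of the same idea with variadic templates. All names in it (identity, find_exact_size, sketch_int16, sketch_int32) are my own and not the original code:

```cpp
#include <climits>
#include <cstddef>
#include <type_traits>

// Wraps a type so that ::type can be requested lazily.
template <typename T>
struct identity { using type = T; };

// Primary template is declared but not defined: if no candidate matches,
// asking for ::type is a compile error.
template <std::size_t Bits, typename... Candidates>
struct find_exact_size;

// Pick Head if it has exactly Bits bits, otherwise search the tail.
template <std::size_t Bits, typename Head, typename... Tail>
struct find_exact_size<Bits, Head, Tail...> {
    using type = typename std::conditional<
        sizeof(Head) * CHAR_BIT == Bits,
        identity<Head>,
        find_exact_size<Bits, Tail...>>::type::type;
};

using sketch_int16 =
    find_exact_size<16, signed char, short, int, long, long long>::type;
using sketch_int32 =
    find_exact_size<32, signed char, short, int, long, long long>::type;
```

On a platform that has no 16-bit or 32-bit type among the candidates, the corresponding alias simply fails to compile, which is the behaviour described above.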
Another example is this:
template< typename TFunc, typename TFwdIter >
typename func_traits<TFunc>::result_t callFunc(TFunc f, TFwdIter begin, TFwdIter end);
Given a function f and a sequence of strings, this will dissect the function's signature, convert the strings from the sequence into the right types, and call the function with these objects. And it's mostly TMP inside.
Here's one trivial example, a binary constant converter, from a previous question here on StackOverflow:
C++ binary constant/literal
template< unsigned long long N >
struct binary
{
enum { value = (N % 10) + 2 * binary< N / 10 > :: value } ;
};
template<>
struct binary< 0 >
{
enum { value = 0 } ;
};
TMP does not necessarily mean faster or more maintainable code. I used the boost spirit library to implement a simple SQL expression parser that builds an evaluation tree structure. While the development time was reduced since I had some familiarity with TMP and lambda, the learning curve is a brick wall for "C with classes" developers, and the performance is not as good as a traditional LEX/YACC.
I see Template Meta Programming as just another tool in my tool-belt. When it works for you use it, if it doesn't, use another tool.
Scott Meyers has been working on enforcing code constraints using TMP.
It's quite a good read:
http://www.artima.com/cppsource/codefeatures.html
In this article he introduces the concept of sets of types (not a new concept, but his work is built on top of it). He then uses TMP to make sure that, no matter what order you specify the members of the set in, two sets made of the same members are equivalent. This requires that he be able to sort and reorder a list of types and compare them dynamically, thus generating compile-time errors when they do not match.
I suggest you read Modern C++ Design by Andrei Alexandrescu - this is probably one of the best books on real-world uses of C++ template metaprogramming; and describes many problems which C++ templates are an excellent solution.
TMP can be used from anything like ensuring dimensional correctness (Ensuring that mass cannot be divided by time, but distance can be divided by time and assigned to a velocity variable) to optimizing matrix operations by removing temporary objects and merging loops when many matrices are involved.
'static const' values work as well. And pointers-to-member. And don't forget the world of types (explicit and deduced) as compile-time arguments!
BUT : Where in the real World do we have constant Literals ?
Suppose you have some code that has to run as fast as possible. It contains the critical inner loop of your CPU-bound computation in fact. You'd be willing to increase the size of your executable a bit to make it faster. It looks like:
double innerLoop(const bool b, const vector<double> & v)
{
    // some logic involving b
    for (vector<double>::const_iterator it = v.begin(); it != v.end(); ++it)
    {
        // significant logic involving b
    }
    // more logic involving b
    return ....
}
The details aren't important, but the use of 'b' is pervasive in the implementation.
Now, with templates, you can refactor it a bit:
template <bool b> double innerLoop_B(const vector<double> & v) { ... same as before ... }
double innerLoop(const bool b, const vector<double> & v)
{ return b ? innerLoop_B<true>(v) : innerLoop_B<false>(v); }
Any time you have a relatively small, discrete, set of values for a parameter you can automatically instantiate separate versions for them.
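Here is a small compilable sketch of this dispatch pattern. The function names and the loop body (summing values or their squares) are made up purely for illustration:

```cpp
#include <vector>

// Squared is a compile-time constant here, so each instantiation can be
// optimized as if the branch inside the loop did not exist.
template <bool Squared>
double accumulate_impl(const std::vector<double>& v) {
    double total = 0.0;
    for (double x : v) {
        total += Squared ? x * x : x;
    }
    return total;
}

// The run-time value is tested once, outside the hot loop.
double accumulate_values(bool squared, const std::vector<double>& v) {
    return squared ? accumulate_impl<true>(v) : accumulate_impl<false>(v);
}
```

Both versions end up in the executable, which is the size-for-speed trade mentioned above.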
Consider the possibilities when 'b' is based on CPU detection. You can run a differently-optimized set of code depending on run-time detection. All from the same source code, or you can specialize some functions for some sets of values.
As a concrete example, I once saw some code that needed to merge some integer coordinates. Coordinate system 'a' was one of two resolutions (known at compile time), and coordinate system 'b' was one of two different resolutions (also known at compile time). The target coordinate system needed to be the least common multiple of the two source coordinate systems. A library was used to compute the LCM at compile time and instantiate code for the different possibilities.
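I don't have that code any more, but with C++17 the compile-time LCM part could be sketched like this, using std::lcm from <numeric>. MergedGrid and its members are invented names, and the conversion shown (plain scaling) is only an assumption about what the merge involved:

```cpp
#include <numeric>

// Target resolution is the least common multiple of the two source
// resolutions, computed entirely at compile time.
template <int ResA, int ResB>
struct MergedGrid {
    static constexpr int resolution = std::lcm(ResA, ResB);

    // Scale a coordinate from system 'a' or 'b' into the merged system.
    static constexpr int from_a(int coord) { return coord * (resolution / ResA); }
    static constexpr int from_b(int coord) { return coord * (resolution / ResB); }
};
```

Each combination such as MergedGrid<4, 6> becomes its own type with the scale factors baked in as constants.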

C++ valarray vs. vector

I like vectors a lot. They're nifty and fast. But I know this thing called a valarray exists. Why would I use a valarray instead of a vector? I know valarrays have some syntactic sugar, but other than that, when are they useful?
valarray is kind of an orphan that was born in the wrong place at the wrong time. It's an attempt at optimization, fairly specifically for the machines that were used for heavy-duty math when it was written -- specifically, vector processors like the Crays.
For a vector processor, what you generally wanted to do was apply a single operation to an entire array, then apply the next operation to the entire array, and so on until you'd done everything you needed to do.
Unless you're dealing with fairly small arrays, however, that tends to work poorly with caching. On most modern machines, what you'd generally prefer (to the extent possible) would be to load part of the array, do all the operations on it you're going to, then move on to the next part of the array.
valarray is also supposed to eliminate any possibility of aliasing, which (at least theoretically) lets the compiler improve speed because it's more free to store values in registers. In reality, however, I'm not at all sure that any real implementation takes advantage of this to any significant degree. I suspect it's rather a chicken-and-egg sort of problem -- without compiler support it didn't become popular, and as long as it's not popular, nobody's going to go to the trouble of working on their compiler to support it.
There's also a bewildering (literally) array of ancillary classes to use with valarray. You get slice, slice_array, gslice and gslice_array to play with pieces of a valarray, and make it act like a multi-dimensional array. You also get mask_array to "mask" an operation (e.g. add items in x to y, but only at the positions where z is non-zero). To make more than trivial use of valarray, you have to learn a lot about these ancillary classes, some of which are pretty complex and none of which seems (at least to me) very well documented.
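To give a flavour of the two simplest helpers, here is a short sketch using std::slice and a boolean mask. The function names every_other and add_where_nonzero are mine:

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Every second element, starting at index 0,
// via std::slice(start, count, stride).
std::valarray<int> every_other(const std::valarray<int>& v) {
    const std::size_t count = (v.size() + 1) / 2;
    return std::valarray<int>(v[std::slice(0, count, 2)]);
}

// Adds items in x to y, but only at the positions where z is non-zero.
void add_where_nonzero(std::valarray<double>& y,
                       const std::valarray<double>& x,
                       const std::valarray<double>& z) {
    assert(x.size() == y.size() && z.size() == y.size());
    const std::valarray<bool> mask(z != 0.0);  // element-wise comparison
    y[mask] += x[mask];                        // y[mask] is a std::mask_array
}
```

gslice works along the same lines but takes arrays of lengths and strides, which is where most of the complexity comes from.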
Bottom line: while it has moments of brilliance, and can do some things pretty neatly, there are also some very good reasons that it is (and will almost certainly remain) obscure.
Edit (eight years later, in 2017): Some of the preceding has become obsolete to at least some degree. For one example, Intel has implemented an optimized version of valarray for their compiler. It uses the Intel Integrated Performance Primitives (Intel IPP) to improve performance. Although the exact performance improvement undoubtedly varies, a quick test with simple code shows around a 2:1 improvement in speed, compared to identical code compiled with the "standard" implementation of valarray.
So, while I'm not entirely convinced that C++ programmers will be starting to use valarray in huge numbers, there are at least some circumstances in which it can provide a speed improvement.
Valarrays (value arrays) are intended to bring some of the speed of Fortran to C++. You wouldn't make a valarray of pointers so the compiler can make assumptions about the code and optimise it better. (The main reason that Fortran is so fast is that there is no pointer type so there can be no pointer aliasing.)
Valarrays also have classes which allow you to slice them up in a reasonably easy way, although that part of the standard could use a bit more work. Resizing them is destructive, and they have had iterators only since C++11.
So, if it's numbers you are working with and convenience isn't all that important use valarrays. Otherwise, vectors are just a lot more convenient.
During the standardization of C++98, valarray was designed to allow some sort of fast mathematical computations. However, around that time Todd Veldhuizen invented expression templates and created blitz++, and similar template-meta techniques were invented, which made valarrays pretty much obsolete before the standard was even released. IIRC, the original proposer(s) of valarray abandoned it halfway into the standardization, which (if true) didn't help it either.
ISTR that the main reason it wasn't removed from the standard is that nobody took the time to evaluate the issue thoroughly and write a proposal to remove it.
Please keep in mind, however, that all this is vaguely remembered hearsay. Take this with a grain of salt and hope someone corrects or confirms this.
I know valarrays have some syntactic sugar
I have to say that I don't think std::valarrays have much in way of syntactic sugar. The syntax is different, but I wouldn't call the difference "sugar." The API is weird. The section on std::valarrays in The C++ Programming Language mentions this unusual API and the fact that, since std::valarrays are expected to be highly optimized, any error messages you get while using them will probably be non-intuitive.
Out of curiosity, about a year ago I pitted std::valarray against std::vector. I no longer have the code or the precise results (although it shouldn't be hard to write your own). Using GCC I did get a little performance benefit when using std::valarray for simple math, but not for my implementations to calculate standard deviation (and, of course, standard deviation isn't that complex, as far as math goes). I suspect that operations on each item in a large std::vector play better with caches than operations on std::valarrays. (NOTE, following advice from musiphil, I've managed to get almost identical performance from vector and valarray).
In the end, I decided to use std::vector while paying close attention to things like memory allocation and temporary object creation.
Both std::vector and std::valarray store the data in a contiguous block. However, they access that data using different patterns, and more importantly, the API for std::valarray encourages different access patterns than the API for std::vector.
For the standard deviation example, at a particular step I needed to find the collection's mean and the difference between each element's value and the mean.
For the std::valarray, I did something like:
std::valarray<double> original_values = ... // obviously I put something here
double mean = original_values.sum() / original_values.size();
std::valarray<double> temp(mean, original_values.size());
std::valarray<double> differences_from_mean = original_values - temp;
I may have been more clever with std::slice or std::gslice. It's been over five years now.
For std::vector, I did something along the lines of:
std::vector<double> original_values = ... // obviously, I put something here
double mean = std::accumulate(original_values.begin(), original_values.end(), 0.0) / original_values.size();
std::vector<double> differences_from_mean;
differences_from_mean.reserve(original_values.size());
std::transform(original_values.begin(), original_values.end(), std::back_inserter(differences_from_mean), std::bind1st(std::minus<double>(), mean));
Today I would certainly write that differently. If nothing else, I would take advantage of C++11 lambdas.
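For reference, a C++11 version with a lambda might look like the sketch below (the function name is mine). Note that it computes value - mean, matching the std::valarray snippet, whereas std::bind1st(std::minus<double>(), mean) binds the mean as the first operand and therefore yields mean - value:

```cpp
#include <algorithm>
#include <iterator>
#include <numeric>
#include <vector>

// Returns value - mean for every element; an empty input gives an empty result.
std::vector<double> differences_from_mean(const std::vector<double>& values) {
    if (values.empty()) return {};
    const double mean =
        std::accumulate(values.begin(), values.end(), 0.0) / values.size();
    std::vector<double> differences;
    differences.reserve(values.size());
    std::transform(values.begin(), values.end(),
                   std::back_inserter(differences),
                   [mean](double x) { return x - mean; });
    return differences;
}
```

std::bind1st was deprecated in C++11 and removed in C++17, so the lambda is also the form that still compiles with current standards.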
It's obvious that these two snippets of code do different things. For one, the std::vector example doesn't make an intermediate collection like the std::valarray example does. However, I think it's fair to compare them because the differences are tied to the differences between std::vector and std::valarray.
When I wrote this answer, I suspected that subtracting the value of elements from two std::valarrays (last line in the std::valarray example) would be less cache-friendly than the corresponding line in the std::vector example (which happens to also be the last line).
It turns out, however, that
std::valarray<double> original_values = ... // obviously I put something here
double mean = original_values.sum() / original_values.size();
std::valarray<double> differences_from_mean = original_values - mean;
Does the same thing as the std::vector example, and has almost identical performance. In the end, the question is which API you prefer.
valarray was supposed to let some FORTRAN vector-processing goodness rub off on C++. Somehow the necessary compiler support never really happened.
The Josuttis book contains some interesting (somewhat disparaging) commentary on valarray (here and here).
However, Intel now seem to be revisiting valarray in their recent compiler releases (e.g. see slide 9); this is an interesting development given that their 4-way SIMD SSE instruction set is about to be joined by 8-way AVX and 16-way Larrabee instructions, and in the interests of portability it'll likely be much better to code with an abstraction like valarray than with (say) intrinsics.
I found one good usage for valarray.
It's to use valarray just like numpy arrays.
auto x = linspace(0, 2 * 3.14, 100);
plot(x, sin(x) + sin(3.f * x) / 3.f + sin(5.f * x) / 5.f);
We can implement above with valarray.
valarray<float> linspace(float start, float stop, int size)
{
    valarray<float> v(size);
    for(int i=0; i<size; i++) v[i] = start + i * (stop-start)/size;
    return v;
}
std::valarray<float> arange(float start, float step, float stop)
{
    int size = (stop - start) / step;
    valarray<float> v(size);
    for(int i=0; i<size; i++) v[i] = start + step * i;
    return v;
}
string psstm(string command)
{//return system call output as string
    string s;
    char tmp[1000];
    FILE* f = popen(command.c_str(), "r");
    while(fgets(tmp, sizeof(tmp), f)) s += tmp;
    pclose(f);
    return s;
}
string plot(const valarray<float>& x, const valarray<float>& y)
{
    int sz = x.size();
    assert(sz == y.size());
    int bytes = sz * sizeof(float) * 2;
    const char* name = "plot1";
    int shm_fd = shm_open(name, O_CREAT | O_RDWR, 0666);
    ftruncate(shm_fd, bytes);
    float* ptr = (float*)mmap(0, bytes, PROT_WRITE, MAP_SHARED, shm_fd, 0);
    for(int i=0; i<sz; i++) {
        *ptr++ = x[i];
        *ptr++ = y[i];
    }
    string command = "python plot.py ";
    string s = psstm(command + to_string(sz));
    shm_unlink(name);
    return s;
}
Also, we need python script.
import sys, posix_ipc, os, struct
import matplotlib.pyplot as plt
sz = int(sys.argv[1])
f = posix_ipc.SharedMemory("plot1")
x = [0] * sz
y = [0] * sz
for i in range(sz):
    x[i], y[i] = struct.unpack('ff', os.read(f.fd, 8))
os.close(f.fd)
plt.plot(x, y)
plt.show()
The C++11 standard says:
The valarray array classes are defined to be free of certain forms of
aliasing, thus allowing operations on these classes to be optimized.
See C++11 26.6.1-2.
With std::valarray you can use the standard mathematical notation like v1 = a*v2 + v3 out of the box. This is not possible with vectors unless you define your own operators.
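A tiny sketch of that notation in use; the function name axpy is borrowed from BLAS terminology and is otherwise arbitrary:

```cpp
#include <valarray>

// Element-wise a * v2 + v3, written the way you would on paper.
// v2 and v3 are expected to have the same size.
std::valarray<double> axpy(double a,
                           const std::valarray<double>& v2,
                           const std::valarray<double>& v3) {
    return a * v2 + v3;
}
```

The equivalent with std::vector needs an explicit loop or a std::transform call with a lambda.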
std::valarray is intended for heavy numeric tasks, such as Computational Fluid Dynamics or Computational Structure Dynamics, in which you have arrays with millions, sometimes tens of millions of items, and you iterate over them in a loop with also millions of timesteps. Maybe today std::vector has a comparable performance but, some 15 years ago, valarray was almost mandatory if you wanted to write an efficient numeric solver.