Assume you have two arrays of equal size and want to construct a new one that holds the element-wise maximum. Of course you use the built-in max and you don't need an explicit index. But now assume the new value is not computable by a vectorizable construct, and you have to evaluate an IF statement inside a DO loop for each index. The compiler can presumably parallelize that anyway (the result depends only on the current loop index). But is a construct like IF (a(:).EQ.b(:)) THEN c(:)=... possible without an explicit index?
I need to create a multidimensional matrix of randomly distributed numbers using a Gaussian distribution, and am trying to keep the program as optimized as possible. Currently I am using Boost matrices, but I can't seem to find anything that accomplishes this without manually looping. Ideally, I would like something similar to Python's numpy.random.randn() function, but this must be done in C++. Is there another way to accomplish this that is faster than manually looping?
You're going to have to loop anyway, but you can eliminate the array lookup inside your loop. True N-dimensional array indexing is going to be expensive, so your best option is a library (or one you write yourself) that also gives you access to an underlying linear data store.
You can then loop over the entire n-dimensional array as if it were linear, avoiding many multiplications of the indexes by the dimensions.
Another optimization is to do away with the index altogether: take a pointer to the first element, then iterate the pointer itself. This does away with a whole variable, which can give the compiler more room for other things. E.g., if you had 1000 elements in a vector:
vector<int> data;
data.resize(1000);
int *intPtr = &data[0];
int *endPtr = &data[0] + 1000;
while (intPtr != endPtr)
{
    *intPtr = rand_function();  // assignment, not '==' comparison
    ++intPtr;
}
Here, two tricks have been used. The end condition is pre-calculated outside the loop itself (this avoids calling a function such as vector::size() 1000 times), and we work with pointers to the data in memory rather than with indexes. An index gets internally converted to a pointer every time it's used to access the array. By storing the "current pointer" and adding 1 to it each time, the cost of calculating the pointer from an index 1000 times is eliminated.
This can be faster but it depends on the implementation. Compilers can do some of the same hand-optimizations, but not all of them. The rand_function should also be inline to avoid the function call overhead.
A warning, however: if you use std::vector with the pointer trick, it's not thread safe; if another thread changes the vector's length during the loop, the vector can get reallocated to a different place in memory. Don't do pointer tricks unless you'd be perfectly comfortable writing your own vector, array, and table classes as needed.
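Putting the linear-storage idea together with the original question: if C++11 is available, <random> provides std::normal_distribution, so a flat buffer can be filled in one pass with no 2D indexing. A minimal sketch (the function name, dimensions, and seed are illustrative):

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Fill a rows*cols matrix, stored as one flat buffer, with N(0,1) samples.
// Element (r, c) lives at index r * cols + c.
std::vector<double> randn_matrix(std::size_t rows, std::size_t cols,
                                 unsigned seed)
{
    std::vector<double> m(rows * cols);
    std::mt19937 gen(seed);
    std::normal_distribution<double> dist(0.0, 1.0);
    for (std::size_t i = 0; i < m.size(); ++i)  // one linear pass
        m[i] = dist(gen);
    return m;
}
```

Element (r, c) is then read as m[r * cols + c], keeping the single-loop fill and the 2D access pattern compatible.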
Let us say that I have a 2D matrix, given by vector<vector<double>> matrix, and that matrix has already been initialized to have R rows and C columns.
There is also a list-of-coordinates (made up of, say, N (x,y) pairs) that we are going to process, such that each co-ordinate maps to a particular row (r) and column (c) in our matrix. So, we basically have [r, c] = f(x,y). The particularities of the mapping function f are not important. However, what we want to do is keep track of the rows r and columns c that are used, by inserting them into another list, called list-of-indices.
The problem is that I do not want to keep adding the same r and c to the list if that (r,c) pair already exists in it. The brute-force method would be to simply scan the entire list-of-indices every time I want to check, but that is going to be very time consuming.
For example, if we have the co-ordinate (x=4, y=5), this yields (r=2, c=6). So, we now add (r=2, c=6) to the list-of-indices. Now we get a new point, given by (x=-2, y=10). This also ends up falling under (r=2, c=6). However, since I have already added (r=2, c=6) to my list, I do not want to add it again! But is there a better way than a brute-force scan of the list-of-indices?
You would need a map to do that.
If you use C++11 you can use unordered_map, which is a hash map and has constant-time lookup; on an older version of C++ you can use the standard map, which is a tree map and has logarithmic lookup.
The performance difference won't be big if you don't have many items.
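A sketch of the lookup using std::set<std::pair<int, int> >, which works pre-C++11 too because std::pair already provides operator< (for unordered_map/unordered_set you would additionally need to supply a hash for the pair, e.g. by encoding the key as r * C + c). The helper name is illustrative:

```cpp
#include <set>
#include <utility>

typedef std::set<std::pair<int, int> > IndexSet;

// Returns true if (r, c) was newly recorded, false if it was already present.
bool record_index(IndexSet &seen, int r, int c)
{
    // set::insert returns a pair; .second tells whether insertion happened,
    // so the membership test and the insertion are one O(log N) operation.
    return seen.insert(std::make_pair(r, c)).second;
}
```

When record_index returns true, you also append (r, c) to the list-of-indices; when it returns false, you skip it.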
Instead of the map or unordered_map you could simply use a matrix vector<vector<bool>> with the same R and C as your other matrix, with every field initialized to false.
Instead of adding an (r,c) pair to a list, you simply set the corresponding boolean in the matrix to true.
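A sketch of that scheme (R and C are assumed known up front, as in the question; the helper name is illustrative):

```cpp
#include <vector>

typedef std::vector<std::vector<bool> > SeenMatrix;

// Marks (r, c) as used; returns true only the first time it is seen.
bool mark(SeenMatrix &used, int r, int c)
{
    if (used[r][c])
        return false;    // already recorded, skip
    used[r][c] = true;   // O(1) check-and-mark, no list scan
    return true;
}
```

Construct the matrix once as SeenMatrix used(R, std::vector<bool>(C, false)); the trade-off versus a map is O(R*C) memory for O(1) lookups.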
What is the complexity of boost::multi_array reshape() function? I expect it to be O(1) but I can't find this info in the documentation. The documentation for this library is actually pretty scarce.
The reason I'm asking is that I would like to iterate through a multi_array object using a single loop (I don't care about array indices). It seems like the library doesn't provide a way of iterating through an array using a single iterator. So, as a workaround, I'd like to reshape the array along a single dimension first (with other dimensions set to 1). Then I can iterate through the array using a single loop. However, I'm not sure how efficient the reshape() operation is.
Hence my second question: Is there an easy way to iterate through all the elements of a multi-array object using a single loop?
Below is the implementation of the reshape function in the multi_array_ref.hpp file.
template <typename SizeList>
void reshape(const SizeList& extents) {
  boost::function_requires<
    CollectionConcept<SizeList> >();
  BOOST_ASSERT(num_elements_ ==
      std::accumulate(extents.begin(), extents.end(),
                      size_type(1), std::multiplies<size_type>()));
  std::copy(extents.begin(), extents.end(), extent_list_.begin());
  this->compute_strides(stride_list_, extent_list_, storage_);
  origin_offset_ =
    this->calculate_origin_offset(stride_list_, extent_list_,
                                  storage_, index_base_list_);
}
It looks like the function just recomputes the strides and origin offset from the new extents. The function is linear in the number of entries in extents (i.e., the number of dimensions), but its complexity is constant in the total number of elements in the array.
I have two sets of pairs (I cannot use C++11):
std::set<std::pair<int,int> > first;
std::set<std::pair<int,int> > second;
and I need to remove from the first set all elements which are also in the second set. I can do this by iterating through the second set and, if the first set contains the same element, erasing it from the first set, but I wonder: is there a way to do this without iteration?
If I understand correctly, basically you want to calculate the difference of first and second. There is an <algorithm> function for that.
std::set<std::pair<int, int> > result;
std::set_difference(first.begin(), first.end(), second.begin(), second.end(),
                    std::inserter(result, result.end()));
Yes, you can.
If you want to remove, not just to detect, there is another <algorithm> function: remove_copy_if():
http://www.cplusplus.com/reference/algorithm/remove_copy_if/
IMHO, it's not so difficult to understand how it works.
I wonder is there way to do this without iteration.
No. Internally, sets are balanced binary trees: there's no way to operate on them without iterating over the structure. (I assume you're interested in the efficiency of the implementation, not convenience in code, so I've deliberately ignored library routines that must iterate internally.)
Sets are sorted, though, so you could do one iteration over each, erasing as you go (so the number of operations is the sum of the set sizes), instead of an iteration plus a lookup for each element (where the number of operations is the number of elements you're iterating over times log base 2 of the number of elements in the other set). Only if one of your sets is much smaller than the other will the iterate/find approach win out. If you look at your library's implementation of set_difference (mentioned in Amen's answer), it should show you how to do the two iterations nicely.
If you want something more efficient, you need to think about how to achieve that earlier: for example, storing your pairs as flags in an identically sized two-dimensional matrix, such that you can AND with the negation of the second set. Whether that's practical depends on the range of int values you're storing and whether the amount of memory needed is acceptable for your purposes.
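For completeness, the straightforward pre-C++11 in-place version from the question itself (each erase-by-value is O(log |first|), so the whole pass is O(|second| * log |first|); the function name is illustrative):

```cpp
#include <set>
#include <utility>

typedef std::set<std::pair<int, int> > PairSet;

// Remove from 'first' every element that also appears in 'second'.
void subtract(PairSet &first, const PairSet &second)
{
    for (PairSet::const_iterator it = second.begin(); it != second.end(); ++it)
        first.erase(*it);   // erase-by-value; a no-op if the element is absent
}
```

Unlike set_difference, this mutates first directly instead of building a third set.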
Can I check in C(++) if an array is all 0 (or false) without iterating/looping over every single value and without allocating a new array of the same size (to use memcmp)?
I'm abusing an array of bools to have arbitrarily large bitsets at runtime and do some bit-flipping on it.
You can use the following condition:
(myvector.end() == std::find(myvector.begin(), myvector.end(), true))
Obviously, internally, this loops over all values.
The alternative (which really should avoid looping) is to override all write-access functions, and keep track of whether true has ever been written to your vector.
UPDATE
Lie Ryan's comments below describe a more robust method of doing this, based on the same principle.
If it's not sorted, no. How would you plan on accomplishing that? You would need to inspect every element to see if it's 0 or not! memcmp, of course, would also check every element. It would just be much more expensive since it reads another array as well.
Of course, you can early-out as soon as you hit a non-0 element.
Your only option would be to use SIMD (which technically still checks every element, but using fewer instructions), but you generally don't do that in a generic array.
(Btw, my answer assumes that you have a simple static C/C++ array. If you can specify what kind of array you have, we could be more specific.)
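For a plain array (the assumption above), the early-out check is a few lines; a sketch:

```cpp
#include <cstddef>

// Returns true iff every element of a[0..n) is false.
bool all_zero(const bool *a, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        if (a[i])
            return false;   // early-out on the first set element
    return true;
}
```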
If you know that this is going to be a requirement, you could build a data structure consisting of an array (possibly dynamic) and a count of currently non-zero cells. Obviously the setting of cells must be abstracted, but that is natural in C++ with overloading, and you can use an opaque type in C.
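A minimal sketch of such a wrapper (the class name and interface are illustrative):

```cpp
#include <cstddef>
#include <vector>

class CountedArray {
    std::vector<int> data_;
    std::size_t nonzero_;
public:
    explicit CountedArray(std::size_t n) : data_(n, 0), nonzero_(0) {}

    // All writes go through set(), which maintains the non-zero count.
    void set(std::size_t i, int v) {
        if (data_[i] != 0) --nonzero_;   // old value leaves the count
        if (v != 0) ++nonzero_;          // new value enters the count
        data_[i] = v;
    }

    int get(std::size_t i) const { return data_[i]; }

    // O(1): no scan of the array needed.
    bool all_zero() const { return nonzero_ == 0; }
};
```

The cost is one extra branch per write in exchange for a constant-time all-zero query.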
Assume that you have an array of N elements; you can do a block comparison against a set of zero-filled base arrays.
For example, say you have a 15-element array you want to test.
You can test it against an 8-element zero array, a 4-element zero array, a 2-element zero array and a 1-element zero array.
You only have to allocate these arrays once, given that you know the maximum size of the arrays you want to test. Furthermore, the tests can be done in parallel (and with assembly intrinsics if necessary).
A further improvement in terms of memory allocation can be made by using only the 8-element array, since a 4-element zero array is simply the first half of the 8-element zero array.
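The scheme above can be sketched with memcmp against a single pre-allocated zero block, reusing its prefixes for the shorter tests as suggested (names are illustrative):

```cpp
#include <cstddef>
#include <cstring>

static const bool kZeros[8] = { false };  // the one allocated zero block

// Test a[0..n) against zero blocks of size 8, 4, 2, 1 (prefixes of kZeros).
bool all_zero_blocks(const bool *a, std::size_t n)
{
    for (std::size_t block = 8; block >= 1; block /= 2)
        while (n >= block) {
            if (std::memcmp(a, kZeros, block * sizeof(bool)) != 0)
                return false;             // early-out on a non-zero block
            a += block;
            n -= block;
        }
    return true;
}
```

For bool this works because false is represented as all-zero bytes; for other element types the same byte-wise comparison applies as long as zero values are all-zero bytes.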
Consider using boost::dynamic_bitset instead. It has a none() member and several other std::bitset-like operations, but its length can be set at runtime.
No, you can compare arrays with memcmp, but you can't compare one value against a block of memory.
What you can do is use algorithms in C++ but that still involves a loop internally.
You don't have to iterate over the entire thing, just stop looping on the first non-zero value.
I can't think of any way to check a set of values other than inspecting each in turn - you could play games with checking the underlying memory as something larger than bool (__int64, say), but alignment then becomes an issue.
EDIT:
You could keep a separate count of set bits, and check whether that is non-zero. You'd have to be careful about maintaining this, so that setting an already-set bit did not ++ it, and so on.
knittl,
I don't suppose you have access to some fancy DMA hardware on the target computer? Sometimes DMA hardware supports exactly the operation you require, i.e. "Is this region of memory all-zero?" This sort of hardware-accelerated comparison is a common solution when dealing with large bit-buffers. For example, some RAID controllers use this mechanism for parity checking.