Clang vector extensions and the equality operator in C++

I wrote a vector type using the Clang SIMD vector extensions. It works well, except when I need to check if two vectors are equal. The == operator doesn't seem to be defined correctly for Clang's vector types. Attempting to compare two vectors with == bizarrely seems to evaluate to a third vector of the same type as the two being compared, instead of a bool. I find this odd, since applying other operations like + or - compiles with no trouble, and outputs the expected results. Here's my code, compiled with Clang 3.5 (Xcode):
// in vect.h
template <typename NumericType>
using vec2 = NumericType __attribute__((ext_vector_type(2))) ;
//in main.cpp
#include "vect.h"
int main(int argc, const char ** argv) {
vec2<int> v0 {0, 1} ;
vec2<int> v1 {0, 1} ;
vec2<int> sumVs = v0 + v1 ; //OK: evaluates to {0, 2} when run
bool equal = (v0 == v1) ; /* Compiler error with message: "Cannot initialize
a variable of type 'bool' with an rvalue of type 'int __attribute__((ext_vector_type(2)))'" */
return 0;
}
Is there any way to enable using operator == with Clang's vector types, or any other workaround to this problem? Since they're considered primitive and not class types, I can't overload a comparison operator myself, and writing a global equals() function seems kludgy and inelegant.
Update: Or if no one has the solution I'm looking for, perhaps someone could explain the default behavior of the == operator when comparing two SIMD vectors?
Update #2: Hurkyl suggested == on two vectors does a vectorized comparison. I updated my code to test that possibility:
template <typename NumericType>
using vec3 = NumericType __attribute__((ext_vector_type(3))) ;
int main(int argc, const char ** argv) {
vec3<int> v0 {1, 2, 3} ;
vec3<int> v1 {3, 2, 1} ;
auto compareVs = (v0 == v1) ;
return 0;
}
LLDB reports the value of compareVs as {0, -1, 0}, which seems almost right if that's what's happening, but it seems weird that true would be -1, and false be 0.
Update #3: Ok, so thanks to the feedback I've gotten, I now have a better understanding of how relational and comparison operators are applied to vectors. But my basic problem remains the same. I need a simple and elegant way to check, for any two SIMD-type vectors v1 and v2, whether they are equivalent. In other words, I need to be able to check that for every index i in v1 and v2, v1[i] == v2[i], expressed as a single boolean value (that is, not as a vector/array of bool). If the only answer really is a function like:
template <typename NumericType>
bool equals(vec2<NumericType> v1, vec2<NumericType> v2) ...
... then I'll accept that. But I'm hoping someone can suggest something less clumsy.

If instead of using the compiler-specific language extensions you use the intrinsics (as provided, for example, in emmintrin.h for SSE2), then you can use
_mm_movemask_ps(__m128) and its relatives. For example
__m128i a,b;
/* some code to fill a,b with integer elements */
bool a_equals_b = 15 == _mm_movemask_ps(_mm_castsi128_ps(_mm_cmpeq_epi32(a,b)));
This code works as follows. First, _mm_cmpeq_epi32(a,b) generates another __m128i in which each of the four elements is either all bits 0 or all bits 1 (I presume operator== for the compiler-provided vector extensions compiles to the same instruction). The cast _mm_castsi128_ps only reinterprets those bits as a __m128 so that the types match. Next, int _mm_movemask_ps(__m128) returns an integer with the kth bit set to the sign bit of the kth element of its argument. Thus, if a==b for all elements, then the whole expression returns 1|2|4|8=15.
I don't know the compiler-supported language extensions, but if you can obtain the underlying __m128i (for 128 bit wide vectors), then you can use this approach (possibly only the call to _mm_movemask_ps()).
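The same "compare, then reduce the mask" idea can also be sketched without intrinsics. The snippet below is only an illustration under assumptions: it uses the GCC-style vector_size attribute (accepted by both GCC and Clang) rather than ext_vector_type, and the names v4si and all_equal are made up for this example.

```cpp
// Assumption: GCC or Clang; vector_size is a compiler extension, not standard C++.
typedef int v4si __attribute__((vector_size(16)));  // four 32-bit ints

// Returns true only if every lane of a equals the matching lane of b.
inline bool all_equal(v4si a, v4si b) {
    v4si mask = (a == b);  // each lane is -1 (all bits set) if equal, 0 otherwise
    // AND the lanes together: the result is non-zero only if all lanes were -1.
    return (mask[0] & mask[1] & mask[2] & mask[3]) != 0;
}
```

A wrapper like this is still a free function, but it keeps the call site down to a single boolean expression such as all_equal(v0, v1).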

Using the bitwise-complement of false as a true value isn't that unusual (see BASIC, for example).
It's particularly useful in vector arithmetic if you want to use it to implement a branch-free ternary operator:
r = (a == c)? b: d
becomes
selector = (a == c)
r = (b & selector) | (d & ~selector)
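To make the mask trick concrete, here is a minimal scalar sketch of the selection above. It works on single 32-bit integers instead of SIMD lanes, and the function names select_by_mask and ternary_if_equal are hypothetical.

```cpp
#include <cstdint>

// mask must be 0 (take d) or -1, i.e. all bits set (take b).
inline std::int32_t select_by_mask(std::int32_t mask, std::int32_t b, std::int32_t d) {
    return (b & mask) | (d & ~mask);
}

// Branch-free equivalent of: r = (a == c) ? b : d
inline std::int32_t ternary_if_equal(std::int32_t a, std::int32_t c,
                                     std::int32_t b, std::int32_t d) {
    // (a == c) is 0 or 1; negating it yields 0 or -1, the same convention
    // the vector comparison uses for each lane.
    std::int32_t selector = -static_cast<std::int32_t>(a == c);
    return select_by_mask(selector, b, d);
}
```

With true represented as 1 instead of -1, the AND would keep only the lowest bit of b, which is why the all-bits-set convention is convenient here.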

Related

how to sort an array in descending order using a boolean function?

Since I am new to competitive programming, I am finding this a bit difficult. I came across the code below and cannot figure it out, so I need some help understanding it.
#include<iostream>
#include<algorithm>
using namespace std;
bool mycompare(int a ,int b){
return a>b;
}
int main(){
int a[]={5,4,3,1,2,6,7};
int n =sizeof(a)/sizeof(int);
sort(a,a+n,mycompare);
for(int i=0; i<n;i++){
cout<<a[i]<<" ";
}
return 0;
}
output:
7 6 5 4 3 2 1
How does this code work? More specifically, what does the mycompare function do?
My doubt is: why haven't we passed any arguments to mycompare() inside main(), given that the prototype of the function is
bool mycompare(int a, int b);

A comparison-based sorting algorithm sorts the elements solely by pair-wise comparison, i.e., if a < b holds, then a has to be placed before b.
This is a fine approach, but if you limit yourself to using <, it only allows you to sort elements in ascending order. What if you want to have them in descending order, or any other ordering? This is where the concept of a Comparator (or a Compare callable in the context of the C++ standard) comes into play: it is a binary predicate bool compare(element a, element b) that replaces the < operator, i.e., a < b becomes compare(a, b) instead. This generalization allows you to encapsulate all types of orderings. In your question you already provided an example where the comparison uses the greater-than operator >, which gives you the aforementioned descending order.
As for how this works internally in C++, the details can be rather complicated, but you can look at it as this:
mycompare without any parameters is a function pointer, i.e. a pointer to the memory address where the machine code for mycompare starts. You can do something like
auto func_pointer = mycompare;
func_pointer(1, 2); // calls mycompare(1, 2)
By giving this function pointer as a parameter to std::sort, you replace the default < comparison by your own. Because std::sort is a template, the compiler can often inline this call, i.e., it avoids the function call by copying the code from mycompare into the std::sort invocation, which can speed up your code significantly.
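As a minimal sketch of the point above, the following passes the same predicate to std::sort once as a function pointer and once as a lambda. The helper names sort_descending and sort_descending_lambda are made up for illustration.

```cpp
#include <algorithm>
#include <vector>

// Same predicate as in the question: true when a must come before b.
bool mycompare(int a, int b) { return a > b; }

// Sorts a copy; std::sort calls mycompare(x, y) itself whenever it
// needs to decide the order of two elements x and y.
std::vector<int> sort_descending(std::vector<int> v) {
    std::sort(v.begin(), v.end(), mycompare);
    return v;
}

// Equivalent call with a lambda instead of a named function.
std::vector<int> sort_descending_lambda(std::vector<int> v) {
    std::sort(v.begin(), v.end(), [](int a, int b) { return a > b; });
    return v;
}
```

This is also why no arguments appear after mycompare in main(): the name is handed over as a value, and std::sort supplies the arguments on each call.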

std::sort takes a RandomIt (random access iterator) as its first and second arguments, and that iterator must satisfy the requirements of ValueSwappable and LegacyRandomAccessIterator. Instead of a Plain-Old-Array of int, prefer std::array, which provides the iterators through the member functions .begin() and .end().
Using a proper container from the C++ standard library makes sorting with std::sort trivial. You need not even provide a custom compare function to sort in descending order, as std::greater<int>() (from <functional>) is provided for you (though your purpose may be to provide the compare function).
Your prototype for mycompare will work fine as is, but preferably the parameters are const type rather than just type, e.g.
bool mycompare(const int a, const int b)
{
return a > b;
}
The implementation using the array container is quite trivial. Simply declare/initialize your array a and then call std::sort (a.begin(), a.end(), mycompare); A complete working example would be:
#include <iostream>
#include <algorithm>
#include <array>
bool mycompare(const int a, const int b)
{
return a > b;
}
int main (void) {
std::array<int, 7> a = { 5, 4, 3, 1, 2, 6, 7 };
std::sort (a.begin(), a.end(), mycompare);
for (auto& i : a)
std::cout << " " << i;
std::cout << '\n';
}
Example Use/Output
$ ./bin/array_sort
7 6 5 4 3 2 1
Sorting the Plain Old Array
If you must use a Plain-Old-Array, then you can use plain-old-pointers as your random access iterators. While not a modern C++ approach, you can handle the plain-old-array with std::sort. You can make use of the builtin std::greater<type>() for a descending sort or std::less<type>() for an ascending sort.
An implementation using pointers would simply be:
#include <iostream>
#include <algorithm>
#include <cstddef>    // size_t
#include <functional> // std::less, std::greater
int main (void) {
int a[] = { 5, 4, 3, 1, 2, 6, 7 };
size_t n = sizeof a / sizeof *a;
#if defined (ASCEND)
std::sort (a, a + n, std::less<int>());
#else
std::sort (a, a + n, std::greater<int>());
#endif
for (size_t i = 0; i < n; i++)
std::cout << " " << a[i];
std::cout << '\n';
}
(same output unless -DASCEND is added as a define on the commandline, and then an ascending sort will result from the use of std::less<int>())
Look things over and let me know if you have further questions.

how to sum up a vector of vector int in C++ without loops

I am trying to sum up all elements of a vector<vector<int>> without writing a loop.
I have checked some relevant questions, such as How to sum up elements of a C++ vector?.
So I tried to use std::accumulate, but I find it hard to write the binary operator that std::accumulate needs.
I am confused about how to implement this with std::accumulate. Or is there a better way?
Could anyone help me?
Thanks in advance.

You need to use std::accumulate twice, once for the outer vector with a binary operator that knows how to sum the inner vector using an additional call to std::accumulate:
int sum = std::accumulate(
vec.begin(), vec.end(), // iterators for the outer vector
0, // initial value for summation - 0
[](int init, const std::vector<int>& intvec){ // binaryOp that sums a single vector<int>
return std::accumulate(
intvec.begin(), intvec.end(), // iterators for the inner vector
init); // current sum
// use the default binaryOp here
}
);

In this case, I do not suggest using std::accumulate as it would greatly impair readability. Moreover, this function uses loops internally, so you would not save anything. Just compare the following loop-based solution with the other answers that use std::accumulate:
int result = 0 ;
for (auto const & subvector : your_vector)
for (int element : subvector)
result += element;
Does using a combination of iterators, STL functions, and lambda functions make your code easier to understand and faster? For me, the answer is clear. Loops are not evil, especially for such a simple application.

According to https://en.cppreference.com/w/cpp/algorithm/accumulate , looks like BinaryOp has the current sum on the left hand, and the next range element on the right. So you should run std::accumulate on the right hand side argument, and then just sum it with left hand side argument and return the result. If you use C++14 or later,
auto binary_op = [&](auto cur_sum, const auto& el){
auto rhs_sum = std::accumulate(el.begin(), el.end(), 0);
return cur_sum + rhs_sum;
};
I didn't try to compile the code though :). If I messed up the order of arguments, just swap them.
Edit: wrong terminology - you don't overload BinaryOp, you just pass it.
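For completeness, here is one way the binary operation described above could be plugged into the outer std::accumulate call. This is a sketch only; the wrapper name sum_all is hypothetical, and the lambda spells out its parameter types so that it also compiles as C++11.

```cpp
#include <numeric>
#include <vector>

// Sums every int in a vector of vectors.
int sum_all(const std::vector<std::vector<int>>& vv) {
    // cur_sum is the running total (left operand),
    // el is the next inner vector (right operand).
    auto binary_op = [](int cur_sum, const std::vector<int>& el) {
        return cur_sum + std::accumulate(el.begin(), el.end(), 0);
    };
    return std::accumulate(vv.begin(), vv.end(), 0, binary_op);
}
```

The argument order matters: std::accumulate always passes the accumulated value first and the dereferenced iterator second.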

Signature of std::accumulate is:
T accumulate( InputIt first, InputIt last, T init,
BinaryOperation op );
Note that the return value is deduced from the init parameter (it is not necessarily the value_type of InputIt).
The binary operation is:
Ret binary_op(const Type1 &a, const Type2 &b);
where... (from cppreference)...
The type Type1 must be such that an object of type T can be implicitly converted to Type1. The type Type2 must be such that an object of type InputIt can be dereferenced and then implicitly converted to Type2. The type Ret must be such that an object of type T can be assigned a value of type Ret.
However, when T is the value_type of InputIt, the above is simpler and you have:
using value_type = typename std::iterator_traits<InputIt>::value_type;
T binary_op(T, const value_type&).
Your final result is supposed to be an int, hence T is int. You need two calls to std::accumulate, one for the outer vector (where value_type == std::vector<int>) and one for the inner vectors (where value_type == int):
#include <iostream>
#include <numeric>
#include <iterator>
#include <vector>
template <typename IT, typename T>
T accumulate2d(IT outer_begin, IT outer_end,const T& init){
using value_type = typename std::iterator_traits<IT>::value_type;
return std::accumulate( outer_begin,outer_end,init,
[](T accu,const value_type& inner){
return std::accumulate( inner.begin(),inner.end(),accu);
});
}
int main() {
std::vector<std::vector<int>> x{ {1,2} , {1,2,3} };
std::cout << accumulate2d(x.begin(),x.end(),0);
}

Solutions based on nesting std::accumulate may be difficult to understand.
By using a 1D array of intermediate sums, the solution can be more straightforward (but possibly less efficient).
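#include <algorithm>
#include <iostream>
#include <iterator>
#include <numeric>
#include <vector>
using namespace std;
int main()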
{
// create a unary operator for 'std::transform'
auto accumulate = []( vector<int> const & v ) -> int
{
return std::accumulate(v.begin(),v.end(),int{});
};
vector<vector<int>> data = {{1,2,3},{4,5},{6,7,8,9}}; // 2D array
vector<int> temp; // 1D array of intermediate sums
transform( data.begin(), data.end(), back_inserter(temp), accumulate );
int result = accumulate(temp);
cerr<<"result="<<result<<"\n";
}
The call to transform accumulates each of the inner arrays to initialize the 1D temp array.

To avoid loops, you'll have to specifically add each element:
std::vector<int> database = {1, 2, 3, 4};
int sum = 0;
int index = 0;
// Start the accumulation
sum = database[index++];
sum += database[index++];
sum += database[index++];
sum += database[index++];
There is no guarantee that std::accumulate will be non-loop (no loops). If you need to avoid loops, then don't use it.
IMHO, there is nothing wrong with using loops: for, while or do-while. Processors that have specialized instructions for summing arrays use loops. Loops are a convenient method for conserving code space. However, there may be times when loops want to be unrolled (for performance reasons). You can have a loop with expanded or unrolled content in it.

With range-v3 (and soon with C++20), you might do
const std::vector<std::vector<int>> v{{1, 2}, {3, 4, 5, 6}};
auto flat = v | ranges::view::join;
std::cout << std::accumulate(begin(flat), end(flat), 0);

C++ class design: dynamic typing alternative to template argument?

I would like to build a space-efficient modular arithmetic class. The idea is that the modulus M is an immutable attribute that gets fixed during instantiation, so if we have a large array (std::vector or another container) of values with the same M, M only needs to be stored once.
If M can be fixed at compile time, this can be done using templates:
template <typename num, num M> class Mod_template
{
private:
num V;
public:
Mod_template(num v=0)
{
if (M == 0)
V = v;
else
{
V = v % M;
if (V < 0)
V += M;
}
}
// ...
};
Mod_template<int, 5> m1(2); // 2 mod 5
However, in my application, we should be able to express M runtime. What I have looks like this:
template <typename num> class Mod
{
private:
const num M;
num V;
public:
Mod(num m, num v=0): M(abs(m))
{
if (M == 0)
V = v;
else
{
V = v % M;
if (V < 0)
V += M;
}
}
// ...
};
Mod<int> m2(5, 2); // 2 mod 5
Mod<int> m3(3); // 0 mod 3
This works, but a large vector of mod M values uses 2x the space it needs to.
I think the underlying conceptual problem is that Mod's of different moduli are syntactically of the same type even though they "should" be different types. For example, a statement like
m2 = m3;
should raise a runtime error "naturally" (in my version, it does so "manually": check is built into the copy constructor, as well as every binary operator I implement).
So, is there a way to implement some kind of dynamic typing so that the Mod object's type remembers the modulus? I'd really appreciate any idea how to solve this.
This is a recurring problem for me with various mathematical structures (e.g. storing many permutations on the same set, elements of the same group, etc.)
EDIT: as far as I understand,
templates are types parametrized by a class or literal.
what I want: a type parametrized by a const object (const num in this case, const Group& or const Group *const for groups, etc.).
Is this possible?

It will be difficult to do it in zero storage space if the class needs to know what M should be without any outside help. Likely the best you can do is store a pointer to a shared M, which may be a little better depending on how large num is. But it's not as good as free.
It will be easier to design if M is a passed-in value to all the functions that need it. Then you can do things like make a pool of objects that all share the same M (there are plenty of easy ways to design this; e.g. map<num, vector<num> >) and only store M once for the pool. The caller will need to know which pool the Mod object came from, but that's probably something it knows anyway.
It's hard to answer this question perfectly in isolation... knowing more about the calling code would definitely help you get better answers.
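A rough sketch of the "pool" idea from this answer might look like the following. Everything here is an assumption made for illustration: the class name ModPool, its interface, and the choice to store only the reduced values in a std::vector while the modulus is kept once per pool.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container: many residues that all share one modulus M.
template <typename Num>
class ModPool {
public:
    explicit ModPool(Num modulus) : m_(modulus < 0 ? -modulus : modulus) {}

    // Store v reduced into [0, M), or unchanged when M == 0.
    void push_back(Num v) { values_.push_back(reduce(v)); }

    Num operator[](std::size_t i) const { return values_[i]; }
    std::size_t size() const { return values_.size(); }
    Num modulus() const { return m_; }

    // Sum of two stored residues, reduced with the shared modulus.
    Num add(std::size_t i, std::size_t j) const {
        return reduce(values_[i] + values_[j]);
    }

private:
    // Same normalization as the constructors in the question.
    Num reduce(Num v) const {
        if (m_ == 0) return v;
        Num r = v % m_;
        return r < 0 ? r + m_ : r;
    }

    Num m_;                    // stored once for the whole pool
    std::vector<Num> values_;  // one Num per element, no per-element modulus
};
```

The trade-off is that an individual element no longer knows its modulus; operations have to go through the pool, which matches the answer's remark that the caller must know which pool a value came from.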

How can I combine some functors to generate a `isOdd` function?

How can I combine several functors to generate a isOdd functor?
equal_to
modulus
bind2nd
...
int nums[] = {0, 1, 2, 3, 4};
vector<int> v1(nums, nums+5), v2;
remove_copy_if(v1.begin(), v1.end(), back_inserter(v2), isOdd);
v2 => {0, 2, 4}

Using just the primitives provided in the standard libraries, this is actually surprisingly difficult because the binders provided by bind1st and bind2nd don't allow you to compose functions. In this particular case, you're trying to check if
x % 2 == 1
Which, if you consider how the <functional> primitives work, is equivalent to
equal_to(modulus(x, 2), 1)
The problem is that the components in <functional> don't allow you to pass the output of one function as the input to another function very easily. Instead, you'll have to rely on some other technique. In this case, you could cheat by using two successive applications of not1:
not1(not1(bind2nd(modulus<int>(), 2)))
This works because it's equivalent to
!!(x % 2)
If x is even, then this is !!0, which is false, and if x is odd this is !!1, which is true. The reason for the double-filtering through not1 is to ensure that the result has type bool rather than int, since
bind2nd(modulus<int>(), 2)
is a function that produces an int rather than the bool you want.
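Note that bind1st and bind2nd were deprecated in C++11 and removed in C++17, and not1 was removed in C++20. If a lambda is acceptable instead of composing <functional> primitives, the same predicate can be sketched as below; the wrapper name remove_odds is hypothetical.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Returns a copy of v1 without its odd elements.
std::vector<int> remove_odds(const std::vector<int>& v1) {
    // Comparing against 0 (rather than == 1) also treats negative odd
    // numbers as odd, since -3 % 2 is -1 in C++.
    auto isOdd = [](int x) { return x % 2 != 0; };
    std::vector<int> v2;
    std::remove_copy_if(v1.begin(), v1.end(), std::back_inserter(v2), isOdd);
    return v2;
}
```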

isOdd can be defined as:
bind2nd(modulus<int>(),2)

How to compare vectors with Boost.Test?

I am using Boost Test to unit test some C++ code.
I have a vector of values that I need to compare with expected results, but I don't want to manually check the values in a loop:
BOOST_REQUIRE_EQUAL(values.size(), expected.size());
for( int i = 0; i < size; ++i )
{
BOOST_CHECK_EQUAL(values[i], expected[i]);
}
The main problem is that the loop check doesn't print the index, so it requires some searching to find the mismatch.
I could use std::equal or std::mismatch on the two vectors, but that will require a lot of boilerplate as well.
Is there a cleaner way to do this?

Use BOOST_CHECK_EQUAL_COLLECTIONS. It's a macro in test_tools.hpp that takes two pairs of iterators:
BOOST_CHECK_EQUAL_COLLECTIONS(values.begin(), values.end(),
expected.begin(), expected.end());
It will report the indexes and the values that mismatch. If the sizes don't match, it will report that as well (and won't just run off the end of the vector).
Note that if you want to use BOOST_CHECK_EQUAL or BOOST_CHECK_EQUAL_COLLECTIONS with non-POD types, you will need to implement
bool YourType::operator!=(const YourType &rhs) // or OtherType
std::ostream &operator<<(std::ostream &os, const YourType &yt)
for the comparison and logging, respectively.
The order of the iterators passed to BOOST_CHECK_EQUAL_COLLECTIONS determines which is the RHS and LHS of the != comparison - the first iterator range will be the LHS in the comparisons.

A bit off-topic, but when you need to compare collections of floating-point numbers with a tolerance, this snippet may be of use:
// Have to make it a macro so that it reports exact line numbers when checks fail.
#define CHECK_CLOSE_COLLECTION(aa, bb, tolerance) { \
using std::distance; \
using std::begin; \
using std::end; \
auto a = begin(aa), ae = end(aa); \
auto b = begin(bb); \
BOOST_REQUIRE_EQUAL(distance(a, ae), distance(b, end(bb))); \
for(; a != ae; ++a, ++b) { \
BOOST_CHECK_CLOSE(*a, *b, tolerance); \
} \
}
This does not print the array indexes of mismatching elements, but it does print the mismatching values with high precision, so that they are often easy to find.
Example usage:
auto mctr = pad.mctr();
std::cout << "mctr: " << io::as_array(mctr) << '\n';
auto expected_mctr = {122.78731602430344, -13.562000155448914};
CHECK_CLOSE_COLLECTION(mctr, expected_mctr, 0.001);

Since Boost 1.59 it is much easier to compare std::vector instances. See this documentation for version 1.63 (which is nearly equal in this respect to 1.59).
For example if you have declared std::vector<int> a, b; you can write
BOOST_TEST(a == b);
to get a very basic comparison. The downside of this is that in case of failure Boost only tells you that a and b are not the same. But you get more info by comparing element-wise which is possible in an elegant way
BOOST_TEST(a == b, boost::test_tools::per_element() );
Or if you want a lexicographic comparison you may do
BOOST_TEST(a <= b, boost::test_tools::lexicographic() );

How about BOOST_CHECK_EQUAL_COLLECTIONS?
BOOST_AUTO_TEST_CASE( test )
{
int col1 [] = { 1, 2, 3, 4, 5, 6, 7 };
int col2 [] = { 1, 2, 4, 4, 5, 7, 7 };
BOOST_CHECK_EQUAL_COLLECTIONS( col1, col1+7, col2, col2+7 );
}
example
Running 1 test case...
test.cpp(11): error in "test": check { col1, col1+7 } == { col2, col2+7 } failed.
Mismatch in a position 2: 3 != 4
Mismatch in a position 5: 6 != 7
* 1 failure detected in test suite "example"

You can use BOOST_REQUIRE_EQUAL_COLLECTIONS with std::vector<T>, but you have to teach Boost.Test how to print a std::vector when you have a vector of vectors or a map whose values are vectors. When you have a map, Boost.Test needs to be taught how to print std::pair. Since you can't change the definition of std::vector or std::pair, you have to do this in such a way that the stream insertion operator you define will be used by Boost.Test without being part of the class definition of std::vector. Also, this technique is useful if you don't want to add stream insertion operators to your system under test just to make Boost.Test happy.
Here is the recipe for any std::vector:
namespace boost
{
// teach Boost.Test how to print std::vector
template <typename T>
inline wrap_stringstream&
operator<<(wrap_stringstream& wrapped, std::vector<T> const& item)
{
wrapped << '[';
bool first = true;
for (auto const& element : item) {
wrapped << (!first ? "," : "") << element;
first = false;
}
return wrapped << ']';
}
}
This formats the vectors as [e1,e2,e3,...,eN] for a vector with N elements and will work for any number of nested vectors, e.g. where the elements of the vector are also vectors.
Here is the similar recipe for std::pair:
namespace boost
{
// teach Boost.Test how to print std::pair
template <typename K, typename V>
inline wrap_stringstream&
operator<<(wrap_stringstream& wrapped, std::pair<const K, V> const& item)
{
return wrapped << '<' << item.first << ',' << item.second << '>';
}
}
BOOST_REQUIRE_EQUAL_COLLECTIONS will tell you the index of the mismatched items, as well as the contents of the two collections, assuming the two collections are of the same size. If they are of different sizes, then that is deemed a mismatch and the differing sizes are printed.
