Fast way to lexicographically compare two numbers - C++

I'm trying to sort a vector of unsigned int in lexicographical order.
The std::lexicographical_compare function only supports iterators so I'm not sure how to compare two numbers.
This is the code I'm trying to use:
std::sort(myVector->begin(),myVector->end(), [](const unsigned int& x, const unsigned int& y){
std::vector<unsigned int> tmp1(x);
std::vector<unsigned int> tmp2(y);
return lexicographical_compare(tmp1.begin(),tmp1.end(),tmp2.begin(),tmp2.end());
} );

C++11 introduces std::to_string.
You can use std::to_string as below:
std::sort(myVector->begin(),myVector->end(), [](const unsigned int& x, const unsigned int& y){
std::string tmp1 = std::to_string(x);
std::string tmp2 = std::to_string(y);
return std::lexicographical_compare(tmp1.begin(),tmp1.end(),tmp2.begin(),tmp2.end());
} );
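For reference, the approach above can be packaged as a small self-contained function. This is a minimal sketch: the name sort_lexicographically is made up for illustration, and it takes the vector by reference rather than through a pointer as in the question. Comparing two std::string objects with operator< is already lexicographical, so the explicit std::lexicographical_compare call is not strictly required.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: sorts numbers by their decimal text, so 10 comes before 2.
inline void sort_lexicographically(std::vector<unsigned int>& values) {
    std::sort(values.begin(), values.end(),
              [](unsigned int x, unsigned int y) {
                  // std::string's operator< compares character by character.
                  return std::to_string(x) < std::to_string(y);
              });
}
```

Each comparison allocates two temporary strings, so for large inputs it may be worth converting every number to text once up front and sorting those strings instead.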

I assume you have some good reasons, but allow me to ask: why are you sorting ints in lexicographical order? In which scenario is 0 not less than 1, for example?
To compare scalars, I suggest using std::less, which is what the standard library itself uses by default.
Your code (from the question) might contain a lambda that will use std::less and that will work perfectly. But let us go one step further and deliver some reusable code ready for pasting into your code. Here is one example:
/// sort a range in place
template< typename T>
inline void dbj_sort( T & range_ )
{
// the type of elements range contains
using ET = typename T::value_type;
// use of the std::less type
using LT = std::less<ET>;
// make its instance whose 'operator ()'
// we will use
LT less{};
std::sort(
range_.begin(),
range_.end(),
[&]( const ET & a, const ET & b) {
return less(a, b);
});
}
The above uses std::less<> internally. It will sort anything that has begin() and end() returning random-access iterators and a public value_type for the elements it contains. In other words, anything that models a random-access range.
Example usage:
std::vector<int> iv_ = { 13, 42, 2 };
dbj_sort(iv_);
std::array<int,3> ia_ = { 13, 42, 2 };
dbj_sort(ia_);
std:: generics in action ...
Why is std::less working here? Among other obvious things, because it compares two scalars. std::lexicographical_compare compares two ranges, element by element.
std::lexicographical_compare might be used to compare two vectors, not two elements from one vector containing scalars.
HTH
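To illustrate the distinction, here is a small sketch (both function names are invented for this example): std::less answers the question for two scalars, while std::lexicographical_compare answers it for two whole ranges.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Scalar comparison: plain numeric order.
inline bool scalar_less(int a, int b) {
    return std::less<int>{}(a, b);
}

// Range comparison: the first differing element decides,
// and a proper prefix is smaller than the longer range.
inline bool range_less(const std::vector<int>& a, const std::vector<int>& b) {
    return std::lexicographical_compare(a.begin(), a.end(), b.begin(), b.end());
}
```

For example, {1, 2, 3} is smaller than {1, 3} as a range, because the second elements already differ, regardless of the lengths.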

Related

Spaceship operator on arrays

The following code is intended to implement comparison on an object that contains an array. Two objects should compare as <,==,> if all array elements compare like that. The following does not compile for a variety of reasons:
#include <compare>
class witharray {
private:
array<int,4> the_array;
public:
witharray( array<int,4> v )
: the_array(v) {};
int size() { return the_array.size(); };
auto operator<=>( const witharray& other ) const {
array< std::strong_ordering,4 > cmps;
for (int id=0; id<4; id++)
cmps[id] = the_array[id]<=>other.the_array[id];
return accumulate
(cmps.begin(),cmps.end(),
std::equal,
[] (auto x,auto y) -> std::strong_ordering { return x and y; }
);
};
};
First of all, the array of comparisons:
call to implicitly-deleted default constructor of 'array<std::strong_ordering, 4>'
Then the attempt to accumulate the comparisons:
no matching function for call to 'accumulate'
Compiler explorer: https://godbolt.org/z/E3ovh5qGa
Or am I completely on the wrong track?
Two objects should compare as <,==,> if all array elements compare like that.
This is a fairly interesting order. One thing to note here is that it's a partial order. That is, given {1, 2} vs {2, 1}, those elements aren't all < or == or >. So you're left with unordered.
C++20's comparisons do have a way to represent that: you have to return a std::partial_ordering.
The way that we can achieve this ordering is that we first compare the first elements, and then we ensure that all the other elements compare the same. If any pair of elements doesn't compare the same, then we know we're unordered:
auto operator<=>( const witharray& other ) const
-> std::partial_ordering
{
std::strong_ordering c = the_array[0] <=> other.the_array[0];
for (int i = 1; i < 4; ++i) {
if ((the_array[i] <=> other.the_array[i]) != c) {
return std::partial_ordering::unordered;
}
}
return c;
}
This has the benefit of not having to compare every pair of elements, since we might already know the answer by the time we get to the 2nd element (e.g. {1, 2, x, x} vs {1, 3, x, x} is already unordered, doesn't matter what the other elements are).
This seems like what you were trying to accomplish with your accumulate, except accumulate is the wrong algorithm here since we want to stop early. You'd want all_of in this case:
auto comparisons = views::iota(0, 4)
| views::transform([&](int i){
return the_array[i] <=> other.the_array[i];
});
bool all_match = ranges::all_of(comparisons | views::drop(1), [&](std::strong_ordering c){
return c == comparisons[0];
});
return all_match ? comparisons[0] : std::partial_ordering::unordered;
Which is admittedly awkward. In C++23, we can do the comparisons part more directly:
auto comparisons = views::zip_transform(
compare_three_way{}, the_array, other.the_array);
And then it would read better if you had a predicate like:
bool all_match = ranges::all_of(comparisons | views::drop(1), equals(comparisons[0]));
or wrote your own algorithm for this specific use-case (which is a pretty easy algorithm to write):
return all_same_value(comparisons)
? comparisons[0]
: std::partial_ordering::unordered;
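Note that std::array already has a spaceship operator. It compares lexicographically (a total order), which may be all you need if the "all elements compare alike" partial order is not a hard requirement: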
class witharray {
private:
array<int, 4> the_array;
public:
witharray(array<int, 4> v)
: the_array(v) {};
int size() { return the_array.size(); };
auto operator<=>(const witharray& other) const
{
return the_array <=> other.the_array;
};
};
https://godbolt.org/z/4drddWa8G
Now to cover problems with your code:
array< std::strong_ordering, 4 > cmps; can't be default-initialized, since std::strong_ordering has no default constructor.
The use of std::accumulate here is strange. There is a better algorithm for that: std::lexicographical_compare_three_way, which was added to handle the spaceship operator.
You have fed std::equal to std::accumulate as the binary operation, when in fact it is an algorithm for comparing ranges (it accepts iterators). Most probably your plan here was to use std::equal_to.

how to sum up a vector of vector int in C++ without loops

I am trying to sum all elements of a vector<vector<int>> without writing loops.
I have checked some relevant questions before, such as How to sum up elements of a C++ vector?.
So I tried to use std::accumulate, but I find it hard to write the binary operator that std::accumulate needs.
So I am confused about how to implement it with std::accumulate, or is there a better way?
If you don't mind, could anyone help me?
Thanks in advance.
You need to use std::accumulate twice, once for the outer vector with a binary operator that knows how to sum the inner vector using an additional call to std::accumulate:
int sum = std::accumulate(
vec.begin(), vec.end(), // iterators for the outer vector
0, // initial value for summation - 0
[](int init, const std::vector<int>& intvec){ // binaryOp that sums a single vector<int>
return std::accumulate(
intvec.begin(), intvec.end(), // iterators for the inner vector
init); // current sum
// use the default binaryOp here
}
);
In this case, I do not suggest using std::accumulate, as it would greatly impair readability. Moreover, this function uses loops internally, so you would not save anything. Just compare the following loop-based solution with the other answers that use std::accumulate:
int result = 0 ;
for (auto const & subvector : your_vector)
for (int element : subvector)
result += element;
Does using a combination of iterators, STL functions, and lambda functions make your code easier to understand and faster? For me, the answer is clear. Loops are not evil, especially for such a simple application.
According to https://en.cppreference.com/w/cpp/algorithm/accumulate , looks like BinaryOp has the current sum on the left hand, and the next range element on the right. So you should run std::accumulate on the right hand side argument, and then just sum it with left hand side argument and return the result. If you use C++14 or later,
auto binary_op = [&](auto cur_sum, const auto& el){
auto rhs_sum = std::accumulate(el.begin(), el.end(), 0);
return cur_sum + rhs_sum;
};
I didn't try to compile the code though :). If I messed up the order of the arguments, just swap them.
Edit: wrong terminology - you don't overload BinaryOp, you just pass it.
Signature of std::accumulate is:
T accumulate( InputIt first, InputIt last, T init,
BinaryOperation op );
Note that the return type is deduced from the init parameter (it is not necessarily the value_type of InputIt).
The binary operation is:
Ret binary_op(const Type1 &a, const Type2 &b);
where... (from cppreference)...
The type Type1 must be such that an object of type T can be implicitly converted to Type1. The type Type2 must be such that an object of type InputIt can be dereferenced and then implicitly converted to Type2. The type Ret must be such that an object of type T can be assigned a value of type Ret.
However, when T is the value_type of InputIt, the above is simpler and you have:
using value_type = typename std::iterator_traits<InputIt>::value_type;
T binary_op(T, const value_type&);
Your final result is supposed to be an int, hence T is int. You need two calls to std::accumulate, one for the outer vector (where value_type == std::vector<int>) and one for the inner vectors (where value_type == int):
#include <iostream>
#include <numeric>
#include <iterator>
#include <vector>
template <typename IT, typename T>
T accumulate2d(IT outer_begin, IT outer_end,const T& init){
using value_type = typename std::iterator_traits<IT>::value_type;
return std::accumulate( outer_begin,outer_end,init,
[](T accu,const value_type& inner){
return std::accumulate( inner.begin(),inner.end(),accu);
});
}
int main() {
std::vector<std::vector<int>> x{ {1,2} , {1,2,3} };
std::cout << accumulate2d(x.begin(),x.end(),0);
}
Solutions based on nesting std::accumulate may be difficult to understand.
By using a 1D array of intermediate sums, the solution can be more straightforward (but possibly less efficient).
#include <algorithm>
#include <iostream>
#include <iterator>
#include <numeric>
#include <vector>
using namespace std;
int main()
{
// create a unary operator for 'std::transform'
// (this local name hides std::accumulate, hence the explicit std:: below)
auto accumulate = []( vector<int> const & v ) -> int
{
return std::accumulate(v.begin(),v.end(),int{});
};
vector<vector<int>> data = {{1,2,3},{4,5},{6,7,8,9}}; // 2D array
vector<int> temp; // 1D array of intermediate sums
transform( data.begin(), data.end(), back_inserter(temp), accumulate );
int result = accumulate(temp);
cerr<<"result="<<result<<"\n";
}
The call to transform accumulates each of the inner arrays to initialize the 1D temp array.
To avoid loops, you'll have to specifically add each element:
std::vector<int> database = {1, 2, 3, 4};
int sum = 0;
int index = 0;
// Start the accumulation
sum += database[index++];
sum += database[index++];
sum += database[index++];
sum += database[index++];
There is no guarantee that std::accumulate will be non-loop (no loops). If you need to avoid loops, then don't use it.
IMHO, there is nothing wrong with using loops: for, while or do-while. Processors that have specialized instructions for summing arrays use loops. Loops are a convenient method for conserving code space. However, there may be times when loops want to be unrolled (for performance reasons). You can have a loop with expanded or unrolled content in it.
With range-v3 (and, since C++20, with the similar std::views::join), you might do
const std::vector<std::vector<int>> v{{1, 2}, {3, 4, 5, 6}};
auto flat = v | ranges::views::join;
std::cout << std::accumulate(begin(flat), end(flat), 0);

map comparator for pair of objects in c++

I want to use a map to count pairs of objects based on member input vectors. If there is a better data structure for this purpose, please tell me.
My program returns a list of int vectors. Each int vector is the output of a comparison between two int vectors ( a pair of int vectors). It is, however, possible, that the output of the comparison differs, though the two int vectors are the same (maybe in different order). I want to store how many different outputs (int vectors) each pair of int vectors has produced.
Assuming that I can access the int vector of my object with .inp()
Two pairs (a1,b1) and (a2,b2) should be considered equal, when (a1.inp() == a2.inp() && b2.inp() == b1.inp()) or (a1.inp() == b2.inp() and b1.inp() == a2.inp()).
This answer says:
The keys in a map a and b are equivalent by definition when neither a
< b nor b < a is true.
class SomeClass
{
vector <int> m_inputs;
public:
//constructor, setter...
vector<int> inp() const { return m_inputs; }
};
typedef pair < SomeClass, SomeClass > InputsPair;
typedef map < InputsPair, size_t, MyPairComparator > InputsPairCounter;
So the question is, how can I define equivalency of two pairs with a map comparator. I tried to concatenate the two vectors of a pair, but that leads to (010,1) == (01,01), which is not what I want.
struct MyPairComparator
{
bool operator() (const InputsPair & pair1, const InputsPair pair2) const
{
vector<int> itrc1 = pair1.first->inp();
vector<int> itrc2 = pair1.second->inp();
vector<int> itrc3 = pair2.first->inp();
vector<int> itrc4 = pair2.second->inp();
// ?
return itrc1 < itrc3;
}
};
I want to use a map to count pairs of input vectors. If there is a better data structure for this purpose, please tell me.
Using std::unordered_map can be considered instead, for two reasons:
if the hash is implemented properly, it could be faster than std::map
you only need to implement a hash and operator== instead of operator<, and operator== is trivial in this case
Details on how to implement a hash for std::vector can be found here. In your case, a possible solution could be to join both vectors into one, sort it, and then use that method to calculate the hash. This is a straightforward solution, but it can produce too many hash collisions and lead to worse performance. Suggesting a better alternative would require knowledge of the data used.
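As a rough illustration of the std::unordered_map route, here is a sketch that keys the map directly on a pair of std::vector<int> (an assumption made to keep the example short; with SomeClass you would hash inp() instead). All names below are hypothetical. Rather than joining and sorting the two vectors, it hashes each vector separately and combines the two hashes in sorted order, which is one possible way to make the result independent of the order within the pair.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

using IntVec = std::vector<int>;
using VecPair = std::pair<IntVec, IntVec>;

// Hash one vector by mixing the element hashes (boost::hash_combine-style).
inline std::size_t hash_vector(const IntVec& v) {
    std::size_t seed = v.size();
    for (int x : v) {
        seed ^= std::hash<int>{}(x) + 0x9e3779b9u + (seed << 6) + (seed >> 2);
    }
    return seed;
}

// Order-independent hash: (a, b) and (b, a) produce the same value.
struct UnorderedPairHash {
    std::size_t operator()(const VecPair& p) const {
        std::size_t h1 = hash_vector(p.first);
        std::size_t h2 = hash_vector(p.second);
        if (h2 < h1) std::swap(h1, h2);  // normalize the order of the two hashes
        return h1 ^ (h2 + 0x9e3779b9u + (h1 << 6) + (h1 >> 2));
    }
};

// Equality as stated in the question: equal directly or equal when swapped.
struct UnorderedPairEqual {
    bool operator()(const VecPair& x, const VecPair& y) const {
        return (x.first == y.first && x.second == y.second) ||
               (x.first == y.second && x.second == y.first);
    }
};

using PairCounter =
    std::unordered_map<VecPair, std::size_t, UnorderedPairHash, UnorderedPairEqual>;
```

Because the vectors are never concatenated, the pairs (010, 1) and (01, 01) stay distinct keys, while (010, 1) and (1, 010) share one counter.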
As I understand, you want:
struct MyPairComparator
{
bool operator() (const InputsPair& lhs, const InputsPair& rhs) const
{
return std::minmax(std::get<0>(lhs), std::get<1>(lhs))
< std::minmax(std::get<0>(rhs), std::get<1>(rhs));
}
};
We order each pair {a, b} so that a < b, then we use the regular pair comparison. This assumes SomeClass provides operator<, for example by comparing inp().
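A self-contained variant of this comparator might look like the sketch below. It assumes the key is a pair of std::vector<int> (which already has operator<) instead of a pair of SomeClass, and the type names are invented for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using Inputs = std::vector<int>;
using InputsKey = std::pair<Inputs, Inputs>;

struct SymmetricPairLess {
    bool operator()(const InputsKey& lhs, const InputsKey& rhs) const {
        // std::minmax returns references into lhs and rhs,
        // which stay valid for the whole expression.
        return std::minmax(lhs.first, lhs.second)
             < std::minmax(rhs.first, rhs.second);
    }
};

using SymmetricPairCounter = std::map<InputsKey, std::size_t, SymmetricPairLess>;
```

Two keys are then equivalent exactly when their normalized (smaller, larger) forms match, so (a, b) and (b, a) land on the same map entry.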

Mapping combination of 4 integers to a single value

I have 4 separate integers that need to be mapped to an arbitrary, constant value.
For example, 4,2,1,1 will map to the number 42
And the number 4,2,1,2 will map to the number 86.
Is there any way I can achieve this by using #defines or some sort of std::map? The concept seems very simple, but for some reason I can't think of a good, efficient method of doing it. The methods I have tried are not working, so I'm looking for some guidance on implementing this.
Will a simple function suffice?
int get_magic_number( int a, int b , int c, int d)
{
if( (a==4)&&(b==2)&&(c==1)&&(d==1) ) return 42;
if( (a==4)&&(b==2)&&(c==1)&&(d==2) ) return 86;
...
throw SomeKindOfError();
}
Now that may look ugly, but you can easily create a macro to pretty it up. (Or a helper class or whatever... I'll just show the macro, as I think it's easy.)
int get_magic_number( int a, int b , int c, int d)
{
#define MAGIC(A,B,C,D,X) if((a==(A))&&(b==(B))&&(c==(C))&&(d==(D))) return (X);
MAGIC(4,2,1,1, 42);
MAGIC(4,2,1,2, 86);
...
#undef MAGIC
throw SomeKindOfError();
}
If you really care you can probably craft a constexpr version of this too, which you'll never be able to do with std::map based solutions.
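A constexpr version could look roughly like this. It is a sketch that assumes C++14 or later (for multiple statements in a constexpr function) and uses std::out_of_range as a stand-in for SomeKindOfError; only the two mappings from the question are filled in.

```cpp
#include <stdexcept>

// Usable at compile time for known combinations; throws at run time otherwise.
constexpr int get_magic_number(int a, int b, int c, int d) {
    if (a == 4 && b == 2 && c == 1 && d == 1) return 42;
    if (a == 4 && b == 2 && c == 1 && d == 2) return 86;
    // Stand-in for SomeKindOfError (an assumption of this sketch).
    throw std::out_of_range("no magic number for this combination");
}

// Evaluated entirely by the compiler.
static_assert(get_magic_number(4, 2, 1, 1) == 42, "compile-time lookup");
```

Calling it with an unknown combination inside a constant expression is a compile error, while the same call at run time throws.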
Utilize a std::map<std::vector<int>, int>, so that the vector containing {4,2,1,1} will have the value 42, and so on.
Edit: I agree std::tuple would be a better way to go if you have a compiler with C++11 support. I used a std::vector because it is arguably more portable at this stage. You could also use a std::array<int, 4>.
If you do not have access to boost::tuple, std::tuple or std::array, you can implement a type holding 4 integers with a suitable less-than comparison satisfying strict weak ordering:
struct FourInts {
int a,b,c,d;
FourInts() : a(), b(), c(), d() {}
bool operator<(const FourInts& rhs) const {
// implement less-than comparison here
}
};
then use an std::map:
std::map<FourInts, int> m;
If you organise your ints in an array or standard library container, you can use std::lexicographical_compare for the less-than comparison.
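One way to fill in that less-than comparison, sketched here with the four ints held in a plain array (a variation on the struct above, not the original layout), is to delegate to std::lexicographical_compare. The make_magic_table helper is hypothetical and only holds the two example values.

```cpp
#include <algorithm>
#include <map>

struct FourInts {
    int v[4];
    // Strict weak ordering: compare the four ints left to right.
    bool operator<(const FourInts& rhs) const {
        return std::lexicographical_compare(v, v + 4, rhs.v, rhs.v + 4);
    }
};

// Hypothetical table holding the two example mappings from the question.
inline std::map<FourInts, int> make_magic_table() {
    std::map<FourInts, int> m;
    const FourInts k1 = {{4, 2, 1, 1}};
    const FourInts k2 = {{4, 2, 1, 2}};
    m[k1] = 42;
    m[k2] = 86;
    return m;
}
```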
If you know there's always 4 integers mapped to 1 integer I suggest you go with:
std::map< boost::tuple<int, int, int, int>, int >
Comparison (lexicographical) is already defined for tuples.
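The same idea with std::tuple (used here in place of boost::tuple, assuming C++11 is available) might be sketched as follows. The function names and the fallback parameter are illustrative choices, not part of the answer above.

```cpp
#include <map>
#include <tuple>

using Key = std::tuple<int, int, int, int>;

// Table with the two example mappings; tuples compare lexicographically.
inline const std::map<Key, int>& magic_table() {
    static const std::map<Key, int> table = {
        {Key{4, 2, 1, 1}, 42},
        {Key{4, 2, 1, 2}, 86},
    };
    return table;
}

// Returns the mapped value, or fallback when the combination is unknown.
inline int magic_number_or(int a, int b, int c, int d, int fallback) {
    const auto it = magic_table().find(Key{a, b, c, d});
    return it != magic_table().end() ? it->second : fallback;
}
```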

How to sort vector of pointer-to-struct

I'm trying to sort a concurrent_vector<hits_object*>, where hits_object is:
struct hits_object{
unsigned long int hash;
int position;
};
Here is the code I'm using:
concurrent_vector<hits_object*> hits;
for(i=0;...){
hits_object *obj=(hits_object*)malloc(sizeof(hits_object));
obj->position=i;
obj->hash=_prevHash[tid];
hits[i]=obj;
}
Now I have filled up a concurrent_vector<hits_object*> called hits.
But I want to sort this concurrent_vector on position property!!!
Here is an example of what's inside a typical hits object:
0 1106579628979812621
4237 1978650773053442200
512 3993899825106178560
4749 739461489314544830
1024 1629056397321528633
5261 593672691728388007
1536 5320457688954994196
5773 9017584181485751685
2048 4321435111178287982
6285 7119721556722067586
2560 7464213275487369093
6797 5363778283295017380
3072 255404511111217936
7309 5944699400741478979
3584 1069999863423687408
7821 3050974832468442286
4096 5230358938835592022
8333 5235649807131532071
I want to sort this based on the first column ("position" of type int). The second column is "hash" of type unsigned long int.
Now I've tried to do the following:
std::sort(hits.begin(),hits.end(),compareByPosition);
where compareByPosition is defined as:
int compareByPosition(const void *elem1,const void *elem2 )
{
return ((hits_object*)elem1)->position > ((hits_object*)elem2)->position? 1 : -1;
}
but I keep getting segmentation faults when I put in the line std::sort(hits.begin(),hits.end(),compareByPosition);
Please help!
Your compare function needs to return a boolean 0 or 1, not an integer 1 or -1, and it should have a strongly-typed signature:
bool compareByPosition(const hits_object *elem1, const hits_object *elem2 )
{
return elem1->position < elem2->position;
}
The errors you were seeing are due to std::sort interpreting every non-zero value returned from the comp function as true, meaning that the left-hand side is less than the right-hand side.
NOTE : This answer has been heavily edited as the result of conversations with sbi and Mike Seymour.
int (*)(void*, void*) is the comparator for C qsort() function. In C++ std::sort() the prototype to the comparator is:
bool cmp(const hits_object* lhs, const hits_object* rhs)
{
return lhs->position < rhs->position;
}
std::sort(hits.begin(), hits.end(), &cmp);
On the other hand, if you store the elements by value, you can use std::pair, whose default comparison orders by the first member (and then by the second):
typedef std::pair<int, unsigned long int> hits_object; // first = position, second = hash
// ...
std::sort(hits.begin(), hits.end());
Without knowing what concurrent_vector is, I can't be sure what's causing the segmentation fault. Assuming it's similar to std::vector, you need to populate it with hits.push_back(obj) rather than hits[i] = obj; you cannot use [] to access elements beyond the end of a vector, or to access an empty vector at all.
The comparison function should be equivalent to a < b, returning a boolean value; it's not a C-style comparison function returning negative, positive, or zero. Also, since sort is a template, there's no need for C-style void * arguments; everything is strongly typed:
bool compareByPosition(hits_object const * elem1, hits_object const * elem2) {
return elem1->position < elem2->position;
}
Also, you usually don't want to use new (and certainly never malloc) to create objects to store in a vector; the simplest and safest container would be vector<hits_object> (and a comparator that takes references, rather than pointers, as arguments). If you really must store pointers (because the objects are expensive to copy and not movable, or because you need polymorphism - neither of which apply to your example), either use smart pointers such as std::unique_ptr, or make sure you delete them once you're done with them.
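A by-value version of that advice could look like the following sketch. It uses std::vector as a stand-in for concurrent_vector (whose exact interface is not known here), and sortByPosition is a made-up helper name.

```cpp
#include <algorithm>
#include <vector>

struct hits_object {
    unsigned long int hash;
    int position;
};

// Comparator takes references and behaves like a < b on position.
inline bool compareByPosition(const hits_object& lhs, const hits_object& rhs) {
    return lhs.position < rhs.position;
}

// Hypothetical helper: sorts the objects themselves, no pointers involved.
inline void sortByPosition(std::vector<hits_object>& hits) {
    std::sort(hits.begin(), hits.end(), compareByPosition);
}
```

Since the objects live directly in the vector, there is nothing to malloc, new, or delete.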
The third argument you pass to std::sort() must have a signature similar to, and the semantics of, operator<():
bool is_smaller_position(const hits_object* lhs, const hits_object* rhs)
{
return lhs->position < rhs->position;
}
When you store pointers in a vector, you cannot overload operator<(), because smaller-than is fixed for all built-in types.
On a sidenote: Do not use malloc() in C++, use new instead. Also, I wonder why you are not using objects, rather than pointers. Finally, if concurrent_vector is anything like std::vector, you need to explicitly make it expand to accommodate new objects. This is what your code would then look like:
concurrent_vector<hits_object> hits;
for(i=0;...){
hits_object obj;
obj.position=i;
obj.hash=_prevHash[tid];
hits.push_back(obj);
}
This doesn't look right:
for(i=0;...){
hits_object *obj=(hits_object*)malloc(sizeof(hits_object));
obj->position=i;
obj->hash=_prevHash[tid];
hits[i]=obj;
}
Here the array is already sorted by position, because you set position to i, which is also the index into hits.
Also, why use malloc? You should use new (and delete) instead. You could then add a simple constructor to the structure to initialize the hits_object,
e.g.
struct hits_object
{
int position;
unsigned int hash;
hits_object( int p, unsigned int h ) : position(p), hash(h) {;}
};
then later write instead
hits_object* obj = new hits_object( i, _prevHash[tid] );
or even
hits.push_back( new hits_object( i, _prevHash[tid] ) );
Finally, your compare function should use the same data type as vector for its arguments
bool cmp( hits_object* p1, hits_object* p2 )
{
return p1->position < p2->position;
}
You can pass a lambda instead of a function to std::sort.
struct test
{
int x;
};
std::vector<test*> tests;
std::sort(tests.begin(), tests.end(),
[](const test* a, const test* b)
{
return a->x < b->x;
});
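If pointers really are required, the lambda approach also combines with smart pointers. The sketch below assumes std::vector and std::unique_ptr rather than concurrent_vector and raw pointers, and sort_hits_by_position is a made-up name.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

struct hits_object {
    unsigned long int hash;
    int position;
};

using HitPtr = std::unique_ptr<hits_object>;

// Sorts owning pointers by the position of the objects they point to.
inline void sort_hits_by_position(std::vector<HitPtr>& hits) {
    std::sort(hits.begin(), hits.end(),
              [](const HitPtr& a, const HitPtr& b) {
                  return a->position < b->position;
              });
}
```

std::sort only moves the unique_ptr values around, so the objects are never copied and are released automatically when the vector is destroyed.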