How to uniquely characterize an array in c++? - c++

Is it possible to characterize an integer array in C++, so that arrays containing the same set of elements have the same characteristic?
I was thinking along the lines of a hash code, where each hash code uniquely identifies an array.
For example, ary[] = {4, 5, 3, 2, 4} and ary_two[] = {4, 4, 2, 3, 5} should both have the same characteristic/hash code.
I am trying to solve this question (asked in an interview): A number of variable-sized arrays are being generated. For each array, determine whether we have previously encountered an array containing the same elements.

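Investigate std::hash. You can probably specialize it to do what you want. For instance, if you want the arrays with values {4, 5, 3, 2, 4} and {4, 4, 2, 3, 5} to hash to the same value, you could specialize it like this. The specialization has to live in namespace std. Strictly speaking, the standard only sanctions specializing std templates for program-defined types, so a separate hasher struct passed as the Hash template argument of std::unordered_set or std::unordered_map is the safer variant.
#include <array>
#include <cstddef>
#include <functional>
#include <numeric>
namespace std {
template<> struct hash<std::array<int, 5>>
{
    size_t operator()(const std::array<int, 5> &ary) const
    {
        return std::accumulate(std::begin(ary), std::end(ary), 0U) * 16777619;
    }
};
}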

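One possible solution would be to use the elements' hashes themselves (assuming the content of the array is hashable), then fold them together with some suitable function (e.g. XOR or, better yet, +, because XOR cancels out pairs of equal elements). Make sure the folding function is commutative and associative, or the order of the array will make a difference. Keep in mind that different arrays can still collide on the same hash, so equal hashes only suggest equal contents.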
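A minimal sketch of how that fold could be used for the interview question. The names multiset_hash and SeenArrays are made up for illustration, and the scrambling step inside the hash is an ad-hoc assumption, not a vetted mixing function. Because distinct arrays can share a hash, the sketch uses the hash only to pick a bucket and confirms a match by comparing sorted copies.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: order-independent hash of a sequence of ints.
// Per-element hashes are combined with +, which is commutative and
// associative, so every permutation of the same elements gives the same value.
std::size_t multiset_hash(const std::vector<int>& values) {
    std::size_t h = 0;
    for (int v : values) {
        std::size_t x = std::hash<int>{}(v);
        // Ad-hoc scrambling (an assumption) to spread small integers out
        // before summing; plain summation collides very easily.
        x = (x ^ (x >> 15)) * 2654435761u;
        x ^= x >> 13;
        h += x;
    }
    return h;
}

// Hypothetical helper: remembers every array seen so far, ignoring order.
class SeenArrays {
public:
    // Returns true if an array with the same elements (and the same
    // multiplicities) was recorded by an earlier call.
    bool seen_before(std::vector<int> values) {
        const std::size_t h = multiset_hash(values);
        std::sort(values.begin(), values.end());  // canonical form for exact comparison
        std::vector<std::vector<int>>& bucket = buckets_[h];
        if (std::find(bucket.begin(), bucket.end(), values) != bucket.end()) {
            return true;
        }
        bucket.push_back(std::move(values));
        return false;
    }

private:
    // hash value -> sorted copies of all arrays that produced that hash
    std::unordered_map<std::size_t, std::vector<std::vector<int>>> buckets_;
};
```

If the extra sort per array is acceptable anyway, storing the sorted copies directly in a std::set<std::vector<int>> would probably be simpler and needs no custom hash at all.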

Related

Add constant value to vector (or other data structure) in N(1)?

I am trying to figure out a problem. I have a vector with elements:
1, 2, 3, 4
and I am wondering if it is possible to increase all (or some) of the values at once in O(1) complexity.
For example, I might want to add 3 so that I have:
4, 5, 6, 7.
Is it even possible? Maybe you can give me some other insight into how to approach this problem.
Thank you in advance
If you mean O(1), then yes, it's possible.
1, 2, 3, 4 are just conventional symbols for the quantities one, two, three, four. You could just as well define that the symbol 1 means four, 2 means five, and so on. Do you see what I am getting at?
You can implement your own vector class and define its const subscript operator so that it interprets the stored values differently:
int operator[](size_t idx) const {
    return v_[idx] + 3; // v_ is an encapsulated std::vector
}
// Or like below, but inheriting std containers is not considered
// a good practice.
int operator[](size_t idx) const {
    return std::vector<int>::operator[](idx) + 3;
}
So, you fill your vector with the values 1, 2, 3, 4 and read the values 4, 5, 6, 7 from a const vector reference.
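The same idea can be generalized from a hard-coded + 3 to an offset that changes at run time. The following is only a sketch under that assumption; OffsetVector, add_to_all and set are hypothetical names. It covers the "all elements" case only: adding to an arbitrary subset in O(1) is not something this trick provides.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical wrapper: stores the values once and keeps a single offset
// that is applied lazily whenever an element is read.
class OffsetVector {
public:
    explicit OffsetVector(std::vector<int> values) : v_(std::move(values)) {}

    // O(1): only the shared offset changes, no element is touched.
    void add_to_all(int delta) { offset_ += delta; }

    // Reads see the stored value plus the accumulated offset.
    int operator[](std::size_t idx) const { return v_[idx] + offset_; }

    // Writes subtract the offset so that a later read returns `value`.
    void set(std::size_t idx, int value) { v_[idx] = value - offset_; }

    std::size_t size() const { return v_.size(); }

private:
    std::vector<int> v_;
    int offset_ = 0;
};
```

Filling it with 1, 2, 3, 4 and calling add_to_all(3) makes the reads return 4, 5, 6, 7, while the underlying storage is left unchanged.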

Index of the lower_bound of a value in sets and maps in c++

Suppose I want to get the index of the lower_bound of a value in a set. If I type
cout<<set.lower_bound(number)-set.begin()<<endl;
it shows an error: no match for ‘operator-’
The same goes for maps as well.
However, for arrays and vectors, if I use
lower_bound(begin,end,val)-begin
it shows the index.
Why is that?
Yes, this is because operator- is not defined for the iterators of std::set (bidirectional iterators), whereas it is defined for array and vector iterators (random access iterators).
Instead, you can use std::distance(), as follows:
#include <iostream>
#include <iterator>
#include <set>

int main()
{
    std::set<int> set {1, 2, 4, 5, 6};
    int number = 3;
    std::cout << std::distance(set.begin(), set.lower_bound(number)) << std::endl; // prints 2
}
Note that the elements of a set are always kept in sorted order; I don't know what you expect.
And as john said, there might be a design flaw: maybe you chose the wrong container for your purpose.
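To make the container trade-off a bit more concrete, here is a small sketch with two overloads of a hypothetical lower_bound_index helper. For std::set, std::distance has to walk the iterators one by one, so obtaining the index is linear in the position. For a sorted std::vector, both the search and the subtraction work on random access iterators.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <set>
#include <vector>

// Index of the first element >= value in a set.
// The search is logarithmic, but std::distance steps through the
// bidirectional iterators, so the call as a whole is linear.
std::size_t lower_bound_index(const std::set<int>& s, int value) {
    return static_cast<std::size_t>(
        std::distance(s.begin(), s.lower_bound(value)));
}

// Same question for a vector that is already sorted:
// binary search plus constant-time iterator subtraction.
std::size_t lower_bound_index(const std::vector<int>& sorted, int value) {
    return static_cast<std::size_t>(
        std::lower_bound(sorted.begin(), sorted.end(), value) - sorted.begin());
}
```

If the index is needed often and the data changes rarely, a sorted vector may therefore be the better fit.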

[C++][std::sort] How does it work on 2D containers?

I have this vector object that contains vectors of ints:
std::vector<std::vector<int>> vec;
I have been trying to figure out how std::sort(vec.begin(), vec.end()) works on it. Here are my observations:
2D vectors are sorted by size.
If some of the inner vectors have the same size, the vector with the lesser first element will have the lesser index.
I have generated a few 2D vectors now, and it seems that these two are always true. However, I have doubts about my second assumption. Does std::sort really work this way, or was it just luck that made my assumptions look correct?
Sorting vector elements works the same way as sorting any other type. std::sort uses the comparison object given as an argument. If none was passed explicitly, std::less is the default.
std::less uses operator<. As per vector documentation, it:
Compares the contents of lhs and rhs lexicographically. The comparison is performed by a function equivalent to std::lexicographical_compare.
Lexicographical comparison is an operation with the following properties:
Two ranges are compared element by element.
The first mismatching element defines which range is lexicographically less or greater than the other.
If one range is a prefix of another, the shorter range is lexicographically less than the other.
If two ranges have equivalent elements and are of the same length, then the ranges are lexicographically equal.
An empty range is lexicographically less than any non-empty range.
Two empty ranges are lexicographically equal.
In short, lexicographical sorting is the same as sorting used for dictionaries (ignoring oddities of some languages).
"2D vectors are sorted by size."
Not quite. {1}, {3, 4}, {1, 2, 5} would be sorted as {1}, {1, 2, 5}, {3, 4}.
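The counterexample can be checked with a short sketch. sorted_rows and sorted_by_size are hypothetical helper names. The second one shows what an explicit comparator could look like if ordering by size is what is actually wanted.

```cpp
#include <algorithm>
#include <vector>

using Rows = std::vector<std::vector<int>>;

// Default behaviour: std::sort compares the inner vectors with operator<,
// i.e. lexicographically.
Rows sorted_rows(Rows rows) {
    std::sort(rows.begin(), rows.end());
    return rows;
}

// Alternative: order by length only. std::stable_sort keeps inner vectors
// of equal length in their original relative order.
Rows sorted_by_size(Rows rows) {
    std::stable_sort(rows.begin(), rows.end(),
                     [](const std::vector<int>& a, const std::vector<int>& b) {
                         return a.size() < b.size();
                     });
    return rows;
}
```

Called with {1}, {3, 4}, {1, 2, 5}, the first function returns {1}, {1, 2, 5}, {3, 4}, while the second one returns {1}, {3, 4}, {1, 2, 5}.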
std::sort uses operator< by default to sort. Since std::vector has an overloaded operator<, it uses that. operator< for std::vector does a lexicographical comparison: the vector with the smaller element at the first position where the two differ compares less. That means {1, 1, 2} is less than {1, 1, 3}, since 2 is less than 3. If the vectors have different lengths and the shorter one is a prefix of the longer one, then the shorter one compares less. For example:
#include <iostream>
#include <vector>

int main()
{
    std::vector a{5, 1}, b{10}; // needs C++17 (class template argument deduction)
    std::cout << (a < b);
}
Prints 1 since 5 is less than 10.
#include <iostream>
#include <vector>

int main()
{
    std::vector a{5, 10}, b{5}; // needs C++17 (class template argument deduction)
    std::cout << (a < b);
}
Prints 0: both vectors start with 5, but b is a prefix of a, so b is the lesser one and a < b is false.

Sort merged array consisting of sorted arrays

I got two sorted arrays e.g. (3,4,5) and (1,3,7,8) and I got the combined sorted array (3,4,5,1,3,7,8).
Now I would like to sort the already combined array, without splitting it, but by overwriting it, making use of the fact that it consists of two arrays which had already been sorted. Is there any way of doing this efficiently? I know there are a lot of threads about how to do this by iterating through the sorted arrays and then putting the values into a new array accordingly, but I haven't seen this type of question anywhere yet. I would like to do this in C, but any help or pseudocode would be very much appreciated. Thanks!
Edit: The function which does the sorting would only be given the combined array and (maybe) the lengths of the other two arrays if needed.
If you already have the original sorted arrays, the combined array (note it is not sorted) doesn't really help, except in that your destination storage is already allocated.
There's a well-known and very simple algorithm for merging two sorted ranges, but you can just use std::merge instead of coding it yourself.
Note that std::merge only works for non-overlapping input and output ranges. For your amended question, use std::inplace_merge, with the middle iterator pointing to the first element of your second sequence:
#include <algorithm>
#include <cstddef>

void sort_combined(int *array, std::size_t total, std::size_t first) {
    std::inplace_merge(array, array + first, array + total);
}

// and use it like
int main() {
    int combined[] = {3, 4, 5, 1, 3, 7, 8};
    const std::size_t first = 3;
    const std::size_t second = 4;
    const std::size_t total = first + second; // == sizeof(combined)/sizeof(*combined)
    sort_combined(combined, total, first);
    // combined is now {1, 3, 3, 4, 5, 7, 8}
}
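For reference, the classic two-pointer merge that the answer alludes to can also be written by hand. The sketch below is one possible variant, not what std::inplace_merge is required to do internally: it copies the first run into a temporary buffer and then merges back into the original array. merge_runs is a hypothetical name. Apart from std::vector, the logic carries over to plain C with a malloc'd buffer.

```cpp
#include <cstddef>
#include <vector>

// Merges the sorted runs [0, first) and [first, total) of `array` so that
// the whole array ends up sorted. Uses O(first) extra memory.
void merge_runs(int* array, std::size_t total, std::size_t first) {
    const std::vector<int> left(array, array + first);  // copy of the first run
    std::size_t i = 0;      // next unread element of `left`
    std::size_t j = first;  // next unread element of the second run
    std::size_t out = 0;    // next position to write in `array`

    // out == i + (j - first), so out < j as long as `left` has elements:
    // a write never clobbers an unread element of the second run.
    while (i < left.size() && j < total) {
        if (left[i] <= array[j]) {  // <= keeps the merge stable
            array[out] = left[i];
            ++i;
        } else {
            array[out] = array[j];
            ++j;
        }
        ++out;
    }
    // Copy what is left of the first run. Any remaining elements of the
    // second run are already in their final positions.
    while (i < left.size()) {
        array[out] = left[i];
        ++i;
        ++out;
    }
}
```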

Customizing compare in bsearch()

I have an array of addresses that point to integers (these integers are sorted in ascending order). They have duplicate values, e.g. 1, 2, 2, 3, 3, 3, 3, 4, 4...
I am trying to get hold of all the values that are greater than a certain value (key). I am currently trying to implement it using the binary search algorithm:
void *bsearch(
const void *key,
const void *base,
size_t num,
size_t width,
int ( __cdecl *compare ) ( const void *, const void *)
);
I can only achieve this for some of the values, not all of them.
Is there any other way to get hold of all those values of the array without changing the algorithm I am using?
As Klatchko and GMan have noted, the STL function gives you exactly what you're asking: std::upper_bound.
If you need to stick with bsearch, though, the simplest solution may be to iterate forwards until you reach a new value.
char *p = (char *)bsearch(key, base, num, width, compare); // NULL if nothing matches
char *end = (char *)base + num * width; // one past the last element
while ((p != NULL) && (p != end) &&
       (compare(key, p) == 0)) { // while p points to an element matching the key
    p += width; // advance p by one element (arithmetic on void* is not allowed)
}
If you want to get the first element that matches key, rather than the first one that's larger, walk backwards instead of forwards, taking care not to step before base.
Whether you prefer this or a repeated binary search, as Michael suggests, depends on the size of the array and how many repetitions you expect.
Now, your question title refers to customizing the compare function, but as I understand the question that won't help you here - the compare function must compare any two equivalent objects as being equivalent, so it's no good for recognizing which of several equivalent objects is the first/last of its type in an array. Unless you had a different problem, specifically concerning the compare function?
You should look into std::upper_bound
For example, to find the address of the first value > 3:
const int data[] = { 1, 2, 2, 3, 3, 3, 3, 4, 4, ... };
size_t data_count = sizeof(data) / sizeof(*data);
const int *ptr = std::upper_bound(data, data + data_count, 3);
// ptr will now point to the address of the first 4
A related function is std::lower_bound.
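Since the question talks about an array of addresses rather than an array of ints, here is a hedged sketch of how std::upper_bound could be applied to that layout with a custom comparison. first_greater is a hypothetical name, and the sketch assumes the pointed-to integers are sorted in ascending order as described in the question.

```cpp
#include <algorithm>
#include <cstddef>

// Returns a pointer to the first slot in [ptrs, ptrs + count) whose
// pointed-to integer is greater than key, or ptrs + count if there is none.
const int* const* first_greater(const int* const* ptrs, std::size_t count, int key) {
    // For std::upper_bound the comparison is called as comp(value, element).
    return std::upper_bound(ptrs, ptrs + count, key,
                            [](int k, const int* element) { return k < *element; });
}
```

Everything from the returned position up to ptrs + count then refers to values greater than the key.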
Yes, you can use a binary search. The trick is what you do when you find an element with the given key... unless your lower and upper indices are the same, you need to continue binary searching in the left part of your interval... that is, you should move the upper bound to be the current midpoint. That way, when your binary search terminates, you will have found the first such element. Then just iterate over the rest.
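A minimal sketch of that modified binary search, written for a plain array of ints to keep it short. first_not_less and first_greater_index are hypothetical names. On a match the upper bound moves to the midpoint, so the search ends at the first element equal to the key, or at the first larger one if the key is absent.

```cpp
#include <cstddef>

// Index of the first element >= key in the sorted array `data`,
// or `count` if every element is smaller than key.
std::size_t first_not_less(const int* data, std::size_t count, int key) {
    std::size_t lo = 0;
    std::size_t hi = count;
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        if (data[mid] < key) {
            lo = mid + 1;  // everything up to mid is too small
        } else {
            hi = mid;      // match or larger: keep searching the left part
        }
    }
    return lo;
}

// Walks forward past the run of elements equal to key, giving the index
// of the first element that is strictly greater.
std::size_t first_greater_index(const int* data, std::size_t count, int key) {
    std::size_t i = first_not_less(data, count, key);
    while (i < count && data[i] == key) {
        ++i;
    }
    return i;
}
```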
If you have a binary search tree implemented, you can use its traversal algorithms for this: reach the required 'upper bound' node and simply traverse in order from there. Traversal is cheaper than searching the tree multiple times: traversing a tree of n nodes takes at most n operations, whereas searching a balanced tree n times takes O(n log n) operations.
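If the values may be kept in a standard container instead of a hand-written tree, std::multiset can serve as a stand-in here, as it is typically implemented as a balanced binary search tree. The sketch below assumes that substitution. values_greater_than is a hypothetical name.

```cpp
#include <set>
#include <vector>

// Collects every value strictly greater than key: one upper_bound lookup,
// followed by an in-order walk to the end of the container.
std::vector<int> values_greater_than(const std::multiset<int>& tree, int key) {
    return std::vector<int>(tree.upper_bound(key), tree.end());
}
```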