Getting minimum values and index from a matrix without repeating index C++ - c++

I have a vector< vector <int> > matrix of size n and I want to get the minimum value for each i and it indexs [i][j] and put it on a vector but I don't want to get any indexs repeated.
I've found a theoretical way but I cannot write it in code.
Make 2 vectors U←{1,...,n}, L←{1,...,n}
Repeat n times
Be (u,l)∈U×L from matrix[u,l] ≤ matrix[i,j], ∀i∈U, ∀j∈L
S[u] ← l
Do U←U-{u} y L←L-{l}

You can code this algorithm directly
typedef vector<vector<int>> Matrix;
typedef pair<size_t, size_t> Index;
typedef vector<Index> IndexList;
IndexList MinimalSequence(const Matrix& matrix) {
IndexList result;
set<size_t> U, L;
for (size_t i = 0; i < matrix.size(); ++i) { // consider square
U.insert(i);
L.insert(i);
}
while (U.size()) { // same as L.size()
int min = numeric_limits<int>::max();
Index minIndex;
for (auto u: U)
for (auto l: L)
if (matrix[u][l] < min) {
minIndex = make_pair(u, l);
min = matrix[u][l];
}
U.erase(minIndex.first);
L.erase(minIndex.second);
result.push_back(minIndex);
}
return result;
}
also your question is not clear in this way: do you want to start from the overall smallest element of the matrix (as your formula said) and then move to the next smallest?
or do you want to move through the columns from left to right? I implemented it according to formulas.
Note that set of non-negative integers in your formula is set<size_t> on which insert() and erase() are available. For all is while-loop
I would also suggest to try alternative algorithm - sort a list of matrix indices by there corresponding values and then iterate over it removing indices you dont want anymore.
edit: code actually differs from algorithm in few ways to be precise. That seemed more practical.
process is repeated until set of indices is exhausted - that is equal to n
return structure is list of 2d indices and encodes more information than array

You already have accepted an answer, but aren't you facing the Assignment problem that can be solved using the Hugarian algorithm, and maybe even more efficient algorithms that exists and are already implemented?

Related

Correct datastructure for user specified integer mapping into rows of a matrix

I have a C++ class that manipulates an NxM matrix. The rows individually are meaningful, but the C++ contiguous indexing [0,1,2,...,N-1] is not. The users find it preferable to choose an indexing which has meaning to them, e.g., for a 3 row matrix, the user may wish to have the integer -3 label row zero, -1 label row 1, and 3 label row 2.
I may assume that 1) the labels are integers, and 2) the labels are monotonically increasing, and 3) the number of rows is not huge. I may not assume the labels are continuous, or even gapped with even spacing. The pseudocode is below:
template<typename T>
class Foo {
public:
Foo(std::vector<int> labels, int columns) {
m_.resize(labels.size()*columns);
}
void update(int label, T value) {
// map label to index, update the entry in the matrix:
int idx = ...;
m_[idx] = value;
}
std::vector<T> get_row(int label) {
// Map label
}
private:
// A matrix:
std::vector<T> m_;
// What datastructure should I use here?
SomeDataStructure label_to_row_;
};
The call to update must be extremely fast. What is the best datastructure to use to quickly map the label to the row of the matrix?
Theoretically speaking, hash maps are the fastest containers for what you're trying to achieve (with O(1)) complexity. But in practice, there are a couple of things you can do.
First of all, you can have multiple implementations using different data structures and choose to return one of these based on the given indices at runtime (using abstract classes or other similar ways). You can do this on the structures I propose below and choose one at runtime.
If you know that the range of data is small (or you can detect it at runtime), Then the problem is easy. Just create a vector that has the same size as the range of data and set the ordered index in this vector:
std::vector<int> indices = {/*data*/};
auto minmax = std::minmax_element(indices.begin(), indices.end());
int min = *minmax.first, max = *minmax.second, range = max - min;
std::vector<int> index_map(range);
for (size_t i = 0; i < indices.size(); ++i) index_map[indices[i] - min] = i;
I hope you got what I'm trying to say because I feel like I didn't explain it very well.
If your range of data is large but the minimum spacing between them is also larger than 1, then you can do the previous method with a small modification:
std::vector<int> indices = {/*data*/};
auto minmax = std::minmax_element(indices.begin(), indices.end());
int min = *minmax.first, max = *minmax.second, range = max - min;
// Assuming indices are sorted
int diff = std::numeric_limits<int>::max();
for (size_t i = 0; i < indices.size() - 1; ++i) diff = std::min(diff, indices[i] - indices[i + 1]);
// diff can't be zero
std::vector<int> index_map(range);
for (size_t i = 0; i < indices.size(); ++i) index_map[(indices[i] - min) / diff] = i;
Here we find the minimum spacing between indices and divide by that.
Use an optimized 3rd party map that is optimized further (using vectorization, multi-threading, and other methods) like these.
Maybe you can try to use a weaker but faster hash function since the number of indices are not large.
I'll add to the list if I think of anything else.

Most efficient algorithm for Two-sum problem (involving indices)

The problem statement is given an array and a given sum "T", find all the pairs of indices of the elements in the array which add up to T. Additional requirements/constraints:
Indexing starts from 0
The indices must be displayed with lower index first (Ex: 24, 30 instead of 30, 24)
The indices must be displayed in ascending order (Ex: if we find (1,3), (0,2) and (5,8) the output must be (0,2) (1,3) (5,8)
There can be duplicate elements in the array, which also have to be considered
Here's my code in C++, I used the hash-table approach using unordered_set:
void Twosum(vector <int> res, int T){
int temp; int ti = -1;
unordered_set<int> s;
vector <int> res2 = res; //Just a copy of the input vector
vector <tuple<int, int>> indices; //Result to be output
for (int i = 0; i < (int)res.size(); i++){
temp = T - res[i];
if (s.find(temp) != s.end()){
while(ti < (int)res.size()){ //While loop for finding all the instances of temp in the array,
//not part of the original hash-table algorithm, something I added
ti = find(res2.begin(), res2.end(), temp) - res2.begin();
//Here find() takes O(n) time which is an issue
res2[ti] = lim; //To remove that instance of temp so that new instances
//can be found in the while loop, here lim = 10^9
if(i <= ti) indices.push_back(make_tuple(i, ti));
else indices.push_back(make_tuple(ti, i));
}
}
s.insert(res[i]);
}
if(ti == -1)
{cout<<"-1 -1"; //if no indices were found
return;}
sort(indices.begin(), indices.end()); //sorting since unordered_set stores elements randomly
for(int i=0; i<(int)indices.size(); i++)
cout<<get<0>(indices[i])<<" "<<get<1>(indices[i])<<endl;
}
This has multiple issues:
firstly that while loop doesn't work as intended, instead it shows SIGABRT error (free(): invalid pointer). The ti index is also somehow going beyond the vector bounds, even though I have that check in the while loop.
Secondly the find() function works in O(n) time, which increases the overall complexity to O(n^2), which is causing my program to timeout during execution. However that function is required since we have to output indices.
Lastly this unordered-set implementation doesn't seem to work when there are many duplicate elements in the array (since sets only take unique elements), which is one of the main constraints of the problem. This makes me think we need some sort of hash function or hashmap to deal with the duplicates? I'm not sure...
All the different algorithms I've found for this on the internet have dealt with just printing the elements and not the indices, hence I've had no luck with this problem.
If any of you know an optimal algorithm for this while also satisfying the constraints and running under O(n) time, your help would be highly appreciated. Thank you in advance.
Here is a pseudo-code answering your question, using hash tables (or maps) and set. I let you translate this to cpp using adapted data structures (in this case, classic hashmaps and sets will do the job well).
Notations: we will denote A the array, n its length, and T the "sum".
// first we build a map element -> {set of indices corresponding to this element}
Let M be an empty map; // or hash map, or hash table, or dictionary
for i from 0 to n-1 do {
Let e = A[i];
if e is not a key of M then {
M[e] = new_set()
}
M[e].add(i)
}
// Now we iterate over the elements
for each key e of M do {
if T-e is a key of M then {
display_combinations(M[e], M[T-e]);
}
}
// The helper function display_combinations
function display_combinations(set1, set2) {
for each element e1 of set1 do {
for element e2 of set2 do {
if e1 < e2 then {
display "(e1, e2)";
} else if e1 > e2 then {
display "(e2, e1)";
}
}
}
}
As said in the comments, the complexity in the worst case of this algorithm is in O(n²). A way to see that we cannot go below this complexity is that the size of the output may be in O(n²), in the case where all elements of the array have the value T/2.
Edit: this pseudo code does not output the pairs in the order. Just store them in an array of pairs, and sort this array before displaying it. Same, I did not treat the case where a pair (i, i) may satisfy the requirement. You may have to consider it (just change e1 > e2 by e1 >= e2 in the last loop)

Sort array of n elements which has k sorted sections

What is the best way to sort an section-wise sorted array as depicted in the second image?
The problem is performing a quick-sort using Message Passing Interface. The solution is performing quick-sort on array sections obtained by using MPI_Scatter() then joining the sorted
pieces using MPI_Gather().
Problem is that the array as a whole is unsorted but sections of it are.
Merging the sub-sections similarly to this solution seems like the best way of sorting the array, but considering that the sub-arrays are already within a single array other sorting algorithms may prove better.
The inputs for a sort function would be the array, it's length and the number of equally sorted sub-sections.
A signature would look something like int* sort(int* array, int length, int sections);
The sections parameter can have any value between 1 and 25. The length parameter value is greater than 0, a multiple of sections and smaller than 2^32.
This is what I am currently using:
int* merge(int* input, int length, int sections)
{
int* sub_sections_indices = new int[sections];
int* result = new int[length];
int section_size = length / sections;
for (int i = 0; i < sections; i++) //initialisation
{
sub_sections_indices[i] = 0;
}
int min, min_index, current_index;
for (int i = 0; i < length; i++) //merging
{
min_index = 0;
min = INT_MAX;
for (int j = 0; j < sections; j++)
{
if (sub_sections_indices[j] < section_size)
{
current_index = j * section_size + sub_sections_indices[j];
if (input[current_index] < min)
{
min = input[current_index];
min_index = j;
}
}
}
sub_sections_indices[min_index]++;
result[i] = min;
}
return result;
}
Optimizing for performance
I think this answer that maintains a min-heap of the smallest item of each sub-array is the best way to handle arbitrary input. However, for small values of k, think somewhere between 10 and 100, it might be faster to implement the more naive solutions given in the question you linked to; while maintaining the min-heap is only O(log n) for each step, it might have a higher overhead for small values of n than the simple linear scan from the naive solutions.
All these solutions create a copy of the input, and they maintain O(k) state.
Optimizing for space
The only way to save space I see is to sort in-place. This will be a problem for the algorithms mentioned above. An in-place algorithm will have two swap elements, but any swaps will likely destroy the property that each sub-array is sorted, unless the larger of the swapped pair is re-sorted into the sub-array it is being swapped to, which will result in an O(n²) algorithm. So if you really do need to conserve memory, I think a regular in-place sorting algorithm would have to be used, which defeats your purpose.

choose n largest elements in two vector

I have two vectors, each contains n unsorted elements, how can I get n largest elements in these two vectors?
my solution is merge two vector into one with 2n elements, and then use std::nth_element algorithm, but I found that's not quite efficient, so anyone has more efficient solution. Really appreciate.
You may push the elements into priority_queue and then pop n elements out.
Assuming that n is far smaller than N this is quite efficient. Getting minElem is cheap and sorted inserting in L cheaper than sorting of the two vectors if n << N.
L := SortedList()
For Each element in any of the vectors do
{
minElem := smallest element in L
if( element >= minElem or if size of L < n)
{
add element to L
if( size of L > n )
{
remove smallest element from L
}
}
}
vector<T> heap;
heap.reserve(n + 1);
vector<T>::iterator left = leftVec.begin(), right = rightVec.begin();
for (int i = 0; i < n; i++) {
if (left != leftVec.end()) heap.push_back(*left++);
else if (right != rightVec.end()) heap.push_back(*right++);
}
if (left == leftVec.end() && right == rightVec.end()) return heap;
make_heap(heap.begin(), heap.end(), greater<T>());
while (left != leftVec.end()) {
heap.push_back(*left++);
push_heap(heap.begin(), heap.end(), greater<T>());
pop_heap(heap.begin(), heap.end(), greater<T>());
heap.pop_back();
}
/* ... repeat for right ... */
return heap;
Note I use *_heap directly rather than priority_queue because priority_queue does not provide access to its underlying data structure. This is O(N log n), slightly better than the naive O(N log N) method if n << N.
You can do the "n'th element" algorithm conceptually in parallel on the two vectors quite easiely (at least the simple variant that's only linear in the average case).
Pick a pivot.
Partition (std::partition) both vectors by that pivot. You'll have the first vector partitioned by some element with rank i and the second by some element with rank j. I'm assuming descending order here.
If i+j < n, recurse on the right side for the n-i-j greatest elements. If i+j > n, recurse on the left side for the n greatest elements. If you hit i+j==n, stop the recursion.
You basically just need to make sure to partition both vectors by the same pivot in every step. Given a decent pivot selection, this algorithm is linear in the average case (and works in-place).
See also: http://en.wikipedia.org/wiki/Selection_algorithm#Partition-based_general_selection_algorithm
Edit: (hopefully) clarified the algorithm a bit.

Finding smallest value in an array most efficiently

There are N values in the array, and one of them is the smallest value. How can I find the smallest value most efficiently?
If they are unsorted, you can't do much but look at each one, which is O(N), and when you're done you'll know the minimum.
Pseudo-code:
small = <biggest value> // such as std::numerical_limits<int>::max
for each element in array:
if (element < small)
small = element
A better way reminded by Ben to me was to just initialize small with the first element:
small = element[0]
for each element in array, starting from 1 (not 0):
if (element < small)
small = element
The above is wrapped in the algorithm header as std::min_element.
If you can keep your array sorted as items are added, then finding it will be O(1), since you can keep the smallest at front.
That's as good as it gets with arrays.
You need too loop through the array, remembering the smallest value you've seen so far. Like this:
int smallest = INT_MAX;
for (int i = 0; i < array_length; i++) {
if (array[i] < smallest) {
smallest = array[i];
}
}
The stl contains a bunch of methods that should be used dependent to the problem.
std::find
std::find_if
std::count
std::find
std::binary_search
std::equal_range
std::lower_bound
std::upper_bound
Now it contains on your data what algorithm to use.
This Artikel contains a perfect table to help choosing the right algorithm.
In the special case where min max should be determined and you are using std::vector or ???* array
std::min_element
std::max_element
can be used.
If you want to be really efficient and you have enough time to spent, use SIMD instruction.
You can compare several pairs in one instruction:
r0 := min(a0, b0)
r1 := min(a1, b1)
r2 := min(a2, b2)
r3 := min(a3, b3)
__m64 _mm_min_pu8(__m64 a , __m64 b );
Today every computer supports it. Other already have written min function for you:
http://smartdata.usbid.com/datasheets/usbid/2001/2001-q1/i_minmax.pdf
or use already ready library.
If the array is sorted in ascending or descending order then you can find it with complexity O(1).
For an array of ascending order the first element is the smallest element, you can get it by arr[0] (0 based indexing).
If the array is sorted in descending order then the last element is the smallest element,you can get it by arr[sizeOfArray-1].
If the array is not sorted then you have to iterate over the array to get the smallest element.In this case time complexity is O(n), here n is the size of array.
int arr[] = {5,7,9,0,-3,2,3,4,56,-7};
int smallest_element=arr[0] //let, first element is the smallest one
for(int i =1;i<sizeOfArray;i++)
{
if(arr[i]<smallest_element)
{
smallest_element=arr[i];
}
}
You can calculate it in input section (when you have to find smallest element from a given array)
int smallest_element;
int arr[100],n;
cin>>n;
for(int i = 0;i<n;i++)
{
cin>>arr[i];
if(i==0)
{
smallest_element=arr[i]; //smallest_element=arr[0];
}
else if(arr[i]<smallest_element)
{
smallest_element = arr[i];
}
}
Also you can get smallest element by built in function
#inclue<algorithm>
int smallest_element = *min_element(arr,arr+n); //here n is the size of array
You can get smallest element of any range by using this function
such as,
int arr[] = {3,2,1,-1,-2,-3};
cout<<*min_element(arr,arr+3); //this will print 1,smallest element of first three element
cout<<*min_element(arr+2,arr+5); // -2, smallest element between third and fifth element (inclusive)
I have used asterisk (*), before min_element() function. Because it returns pointer of smallest element.
All codes are in c++.
You can find the maximum element in opposite way.
Richie's answer is close. It depends upon the language. Here is a good solution for java:
int smallest = Integer.MAX_VALUE;
int array[]; // Assume it is filled.
int array_length = array.length;
for (int i = array_length - 1; i >= 0; i--) {
if (array[i] < smallest) {
smallest = array[i];
}
}
I go through the array in reverse order, because comparing "i" to "array_length" in the loop comparison requires a fetch and a comparison (two operations), whereas comparing "i" to "0" is a single JVM bytecode operation. If the work being done in the loop is negligible, then the loop comparison consumes a sizable fraction of the time.
Of course, others pointed out that encapsulating the array and controlling inserts will help. If getting the minimum was ALL you needed, keeping the list in sorted order is not necessary. Just keep an instance variable that holds the smallest inserted so far, and compare it to each value as it is added to the array. (Of course, this fails if you remove elements. In that case, if you remove the current lowest value, you need to do a scan of the entire array to find the new lowest value.)
An O(1) sollution might be to just guess: The smallest number in your array will often be 0. 0 crops up everywhere. Given that you are only looking at unsigned numbers. But even then: 0 is good enough. Also, looking through all elements for the smallest number is a real pain. Why not just use 0? It could actually be the correct result!
If the interviewer/your teacher doesn't like that answer, try 1, 2 or 3. They also end up being in most homework/interview-scenario numeric arrays...
On a more serious side: How often will you need to perform this operation on the array? Because the sollutions above are all O(n). If you want to do that m times to a list you will be adding new elements to all the time, why not pay some time up front and create a heap? Then finding the smallest element can really be done in O(1), without resulting to cheating.
If finding the minimum is a one time thing, just iterate through the list and find the minimum.
If finding the minimum is a very common thing and you only need to operate on the minimum, use a Heap data structure.
A heap will be faster than doing a sort on the list but the tradeoff is you can only find the minimum.
If you're developing some kind of your own array abstraction, you can get O(1) if you store smallest added value in additional attribute and compare it every time a new item is put into array.
It should look something like this:
class MyArray
{
public:
MyArray() : m_minValue(INT_MAX) {}
void add(int newValue)
{
if (newValue < m_minValue) m_minValue = newValue;
list.push_back( newValue );
}
int min()
{
return m_minValue;
}
private:
int m_minValue;
std::list m_list;
}
//find the min in an array list of #s
$array = array(45,545,134,6735,545,23,434);
$smallest = $array[0];
for($i=1; $i<count($array); $i++){
if($array[$i] < $smallest){
echo $array[$i];
}
}
//smalest number in the array//
double small = x[0];
for(t=0;t<x[t];t++)
{
if(x[t]<small)
{
small=x[t];
}
}
printf("\nThe smallest number is %0.2lf \n",small);
Procedure:
We can use min_element(array, array+size) function . But it iterator
that return the address of minimum element . If we use *min_element(array, array+size) then it will return the minimum value of array.
C++ implementation
#include<bits/stdc++.h>
using namespace std;
int main()
{
int num;
cin>>num;
int arr[10];
for(int i=0; i<num; i++)
{
cin>>arr[i];
}
cout<<*min_element(arr,arr+num)<<endl;
return 0;
}
int small=a[0];
for (int x: a.length)
{
if(a[x]<small)
small=a[x];
}
C++ code
#include <iostream>
using namespace std;
int main() {
int n = 5;
int arr[n] = {12,4,15,6,2};
int min = arr[0];
for (int i=1;i<n;i++){
if (min>arr[i]){
min = arr[i];
}
}
cout << min;
return 0;
}