Handling duplicate values while merging k sorted arrays

I am trying to merge k sorted arrays of structs into a single one. I know the algorithm that uses a min-heap to merge the arrays, and I am using priority_queue in C++ to implement the heap. My code looks like this:
struct Num {
int key;
int val;
};
// Struct used in priority queue.
struct HeapNode
{
Num num; // Holds one element.
int vecNum; //Array number from which the element is fetched.
int vecSize; // Holds the size of the array.
int next; // Holds the index of the next element to fetch.
};
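// Struct used to compare nodes in a priority queue.
// std::priority_queue keeps the element that compares "largest" on top,
// so the comparison uses > to get a min-heap on (key, val).
struct CompareHeapNode
{
bool operator()(const HeapNode& x, const HeapNode& y) const
{
return (x.num.key > y.num.key) || ( (x.num.key == y.num.key)&&(x.num.val > y.num.val) );
}
};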
vector<vector<Num>> v;
priority_queue< HeapNode, vector<HeapNode>, CompareHeapNode> p_queue;
//Insert the first element of the individual arrays into the heap.
while(!p_queue.empty())
{
HeapNode x = p_queue.top();
cout << x.num.key << ' ' << x.num.val << '\n';
p_queue.pop();
if(x.next != x.vecSize) {
HeapNode hq = {v[x.vecNum][x.next], x.vecNum, x.vecSize, ++x.next};
p_queue.push(hq);
}
}
Let's consider 3 sorted arrays as shown below.
Array1: Array2: Array3:
0 1 0 10 0 0
1 2 2 22 1 2
2 4 3 46 2 819
3 7 4 71 3 7321
Now the problem is that some elements can be common among the arrays, as shown above. So while merging the arrays, duplicate values appear in the sorted output.
Are there any ways to handle duplicate keys?

So your question is whether there is a way to check if the value you are about to insert is already in the output list.
One solution is to use a hash table (unordered_set). Before inserting, check whether the element exists in it. If not, insert the element into both the list and the hash table.
But you can do better. Since you are merging sorted arrays, the output is also sorted. So if duplicates exist, they will be adjacent in the output array. Before inserting, just compare the value with the last value of the output.

Related

To make array identical by swapping elements

There are 2 input arrays. They are identical when they contain exactly the same numbers. To make them identical, we can swap their elements. Each swap has a cost: if we swap elements a and b, then cost = min(a, b).
While making the arrays identical, the total cost should be minimum.
If it is not possible to make the arrays identical, print -1.
i/p:
3 6 6 2
2 7 7 3
o/p :
4
Here I have swapped (2,7) and (2,6). So min Cost = 2 + 2 = 4.
Logic :
Make 2 maps which store the frequency of each input array's elements.
If element "a" in aMap is also present in bMap, then the number of swaps to consider for a = abs(freq(a) in aMap - freq(a) in bMap).
If the frequency of an element is odd, then it is not possible to make the arrays identical.
Else, add the total swaps from both maps and find the cost using
cost = smallest element * total swaps
Here is the code
#include<iostream>
#include<algorithm>
#include<map>
using namespace std;
int main()
{
int t;
cin >> t;
while(t--)
{
int size;
long long int cost = 0;
cin >> size;
bool flag = false;
map<long long int, int> aMap;
map<long long int, int> bMap;
// storing frequency of elements of 1st input array in map
for( int i = 0 ; i < size; i++)
{
long long int no;
cin >> no;
aMap[no]++;
}
// storing frequency of elements of 2nd input array in map
for(int i = 0 ; i < size; i++)
{
long long int no;
cin >> no;
bMap[no]++;
}
// fetching smallest element (i.e. 1st element) from both map
long long int firstNo = aMap.begin()->first;
long long int secondNo = bMap.begin()->first;
long long int smallestNo;
// finding smallest element from both maps
if(firstNo < secondNo)
smallestNo = firstNo;
else
smallestNo = secondNo;
map<long long int, int> :: iterator itr;
// trying to find out total number of swaps we have to perform
int totalSwapsFromA = 0;
int totalSwapsFromB = 0;
// traversing aMap
for(itr = aMap.begin(); itr != aMap.end(); itr++)
{
// if element "a" in aMap is also present in bMap, then we have to consider
// number of swapping = abs(freq(a) in aMap - freq(a) in bMap)
auto newItr = bMap.find(itr->first);
if(newItr != bMap.end())
{
if(itr->second >= newItr->second)
{
itr->second -= newItr->second;
newItr->second = 0;
}
else
{
newItr->second -= itr->second;
itr->second = 0;
}
}
// if freq is "odd" then, this input is invalid as it can not be swapped
if(itr->second & 1 )
{
flag = true;
break;
}
else
{
// if freq is even, then we need to swap only for freq(a)/ 2 times
itr->second /= 2;
// if swapping element is smallest element then we required 1 less swap
if(itr->first == smallestNo && itr->second != 0)
totalSwapsFromA += itr->second -1;
else
totalSwapsFromA += itr->second;
}
}
// traversing bMap to check whether there any number is present which is
// not in aMap.
if(!flag)
{
for(itr = bMap.begin(); itr != bMap.end(); itr++)
{
auto newItr = aMap.find(itr->first);
if( newItr == aMap.end())
{
// if freq is odd, then i/p is invalid
if(itr->second & 1)
{
flag = true;
break;
}
else
{
itr->second /= 2;
// if swapping element is smallest element then we required 1 less swap
if(itr->first == smallestNo && itr->second != 0)
totalSwapsFromB += itr->second -1;
else
totalSwapsFromB += itr->second;
}
}
}
}
if( !flag )
{
cost = smallestNo * (totalSwapsFromB + totalSwapsFromA);
cout<<"cost "<<cost <<endl;
}
else
cout<<"-1"<<endl;
}
return 0;
}
The above code compiles without errors, but it gives a wrong answer and is not accepted.
Can anyone improve this code / logic?

Suppose you have 2 arrays:
A: 1 5 5
B: 1 4 4
We know that we want to move a 5 down and a 4 up, so we have two options: swapping 4 with 5 directly (with cost min(4, 5) = 4), or using the minimum element to achieve the same result with 2 swaps:
A: 1 5 5 swap 1 by 4 (cost 1)
B: 1 4 4
________
A: 4 5 5 swap 1 by 5 (cost 1)
B: 1 1 4
________
A: 4 1 5 total cost: 2
B: 5 1 4
So the question to ask at every swap is: is it better to swap directly, or to swap twice using the minimum element as a pivot?
In a nutshell, let m be the minimum element across both arrays, and suppose you want to swap i for j. The cost of the swap will be
min( min(i,j), 2 * m )
So just find out all the swaps you need to do, apply this formula and sum the results to get your answer.

@user1745866 You can simplify the task of detecting the -1 answer by using only one variable:
Let int x = 0 and XOR all the input integers into it, like this:
int x = 0;
for(int i=0;i<n;i++){
cin>>a[i];
x = x^a[i];
}
for(int i=0;i<n;i++){
cin>>b[i];
x = x^b[i];
}
if(x!=0)
cout<<-1;
else{
...do code for remain 2 condition...
}
How does this work? For the arrays to be made identical, every number must occur an even number of times across both arrays, and XOR-ing a number with itself an even number of times gives 0. So if x is non-zero, the arrays can't be made identical. (This is a necessary condition only: different numbers can also XOR to 0.)
For the 2nd condition (the answer is 0), you can use a multimap, so that after insertion you can compare both arrays directly in O(n) time: if all elements of both arrays are the same, output 0.
(Note: I am suggesting multimap because 1: both arrays are kept sorted and all elements are kept, including duplicates.
2: because they are sorted, if they contain the same element at every position we can output 0; otherwise you have to proceed to the 3rd condition and swap elements.)

For reducing the swap cost, see Daniel's answer. To find out whether the swaps are possible at all: they are only possible if each number occurs an even number of times in total, so that its copies can be split evenly between the arrays. If you have 2, 4 or 6 5's you are good, but if you have 1, 3 or 5 5's, return -1. For actually solving the problem, there is a very simple solution I can think of, though it is a little expensive. You just need to make sure each number appears the same number of times on each side, and a simple way to do that is to declare a new array:
int temp[size of original arrays];
//Go through both arrays and store them in temp
Take half of each element, so something like:
int count[max element in array - min element in array];
for(int i = 0; i < temp.size(); i++){
count[temp[i]]++;
}
Halve each count, since each array should end up with half of the copies. Then walk through the arrays: whenever you see an element, decrement its entry in the count array, so when you see a 1 do something like count[1]-- (assuming count is indexed from 0). If the entry is already zero when you see that element, a swap needs to be done; in that case find the next minimum in the other array and swap them. It is a little expensive, but it is the simplest way I can think of. So for example in your case:
i/p:
3 6 6 2
2 7 7 3
o/p :
4
We would store the minimum as 2, because that is the smallest element. So we would have a count array that looks like the following:
1 1 0 0 1 1
// one 2, one 3, zero 4s, zero 5s, one 6 and one 7
You would go through the first array. When you see the second 6, the count at index 6 is already zero, so you know you need to swap it: you find the minimum in the other array, which is 2, and swap 6 with 2. After that you can go through the rest of the array smoothly. Finally you go through the second array; when you see the last 7, it looks for the minimum on the other side, which is 2, and swaps them. Note that if you had three 2's on one side and one 2 on the other, chances are the three 2's will go to the other side and two of them will come back, because we are always swapping the minimum, so there will always be an even number of ways we can rearrange the elements.

Problem link https://www.codechef.com/JULY20B/problems/CHFNSWPS
Here, for calculating the minimum swap cost, we have 2 cases.
Let's take an example:
l1=[1,2,2]
l2=[1,5,5]
case 1. swap each pair wrt to min(l1,l2)=1
step 1 swapping single 2 of a pair of 2 from l1-> [1,1,2]
[2,5,5] cost is 1
step 2 swapping single 5 of a pair of 5 from l1-> [1,5,2]
[2,1,5] cost is 1
total cost is 2
case 2. Swap the min of l1 with the max of l2 (repeat until both lists end).
If we sort the 1st list in increasing order and the other in decreasing order, then we can minimize the cost.
l1=[1,2,2]
l2=[5,5,1]
The trick is that we only need to store min(l1,l2) in a variable, say mn. Then remove all common elements from both lists.
Now the lists become l1=[2,2]
l2=[5,5]
Then swap the elements at indices 0 to len(l1)-1 with a jump of 2, like 0,2,4,6..., because each odd-indexed neighbour will be the same as the previous number.
After performing the swap, the cost will be 2 and
l1=[5,2]
l2=[2,5] cost is 2
total cost is 2
Let's take another example:
l1=[2,2,5,5]
l2=[3,3,4,4]
After solving via min(l1,l2), the total cost will be 2+2+2=6,
but the cost after sorting the lists, with swaps (2,4) and (5,3), is 2+3=5.
So the minimum cost to make the lists identical is min(5,6)=5.
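# code
# l1 and l2 hold only the elements left after removing the common ones;
# mn is the minimum over both original lists.
l1.sort()
l2.sort(reverse=True)
sums = 0
# step of 2: each element appears in pairs, and only one of each pair moves
for i in range(0, len(l1), 2):
    sums += min(min(l1[i], l2[i]), 2 * mn)
print(sums)
# print -1 if any key has an odd total count (the sum of its counts in both lists)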

Using sort function to sort vector of tuples in a chained manner

So I tried sorting my list of tuples so that the first element of the next tuple equals the second element of the current tuple (the first tuple being the one with the smallest first element).
(x can be anything)
unsorted
3 5 x
4 6 x
1 3 x
2 4 x
5 2 x
sorted
1 3 x
3 5 x
5 2 x
2 4 x
4 6 x
I used the following function as my third argument in the custom sort function
bool myCompare(tuple<int,int,int>a,tuple<int,int,int>b){
if(get<1>(a) == get<2>(b)){
return true;
}
return false;
}
But my output was unchanged. Please help me fix the function or suggest me another way.

This can't be achieved by using std::sort with a custom comparison function. Your comparison function doesn't establish a strict weak order on your elements.
The std::sort documentation states that the comparison function has to fulfill the Compare requirements, which say the function has to induce a strict weak ordering.
See https://en.wikipedia.org/wiki/Weak_ordering for the properties of a strict weak order
Compare requirements: https://en.cppreference.com/w/cpp/named_req/Compare
The comparison function has to return true if the first argument is before the second argument with respect to the strict weak order.
For example, the tuple a=(4, 4, x) violates the irreflexivity property, which requires comp(a, a) == false.
And a=(4, 6, x) and b=(6, 4, y) violate the asymmetry property: if comp(a, b) == true, then comp(b, a) must not be true.

I am not sure where the real problem is coming from.
But the background is the Cyclic Permutation Problem.
In your special case you are looking for a k-cycle where k is equal to the count of tuples. I drafted a solution for you that will show all cycles (not only the desired k-cycle).
And I use the notation described in the provided link. The other values of the tuple are irrelevant for the problem.
But how to implement?
The secret is to select the correct container types. I use 2. For a cycle, I use a std::unordered_set. This can contain only unique elements. With that, an infinite cycle will be prevented. For example: 0,1,3,0,1,3,0,1,3 . . . is not possible, because each digit can only be in the container once. That will stop our way through the permutations. As soon as we see a number that is already in a cycle, we stop.
All found cycles will be stored in the second container type: a std::set. The std::set can also contain only unique values, and the values are ordered. Because we store complex data in the std::set, we create a custom comparator for it. We need to take care that the std::set will not contain duplicate entries. In our case, 0,1,3 and 1,3,0 also count as duplicates. In our custom comparator, we first copy the 2 sets into std::vectors and sort them. This turns 1,3,0 into 0,1,3. Then we can easily detect duplicates.
Please note:
I do always only store a value from the first permutation in the cycle. The 2nd is used as helper, to find the index of the next value to evaluate.
Please see the code below. It produces 4 non-trivial cycles, and one has the expected number of elements: 1,3,5,2,4.
Program output:
Found Cycles:
(1,3,5,2,4)(3,5,2,4)(2,4)(5,2,4)
Please digest.
#include <iostream>
#include <vector>
#include <algorithm>
#include <unordered_set>
#include <iterator>
#include <set>
#include <string>
// Make reading easier and define some alias names
using MyType = int;
using Cycle = std::unordered_set<MyType>;
using Permutation = std::vector<MyType>;
using Permutations = std::vector<Permutation>;
// We do not want to have duplicate results.
// A Cycle with the same elements in a different order also counts as a duplicate
// So define a custom comparator functor for our resulting set
struct Comparator {
bool operator () (const Cycle& lhs, const Cycle& rhs) const {
// Convert the unordered_sets to vectors
std::vector<MyType> v1(lhs.begin(), lhs.end());
std::vector<MyType> v2(rhs.begin(), rhs.end());
// Sort them
std::sort(v1.begin(), v1.end());
std::sort(v2.begin(), v2.end());
// Compare them
return v1 < v2;
}
};
// Resulting cycles
using Cycles = std::set<Cycle, Comparator>;
int main() {
// The source data
Permutations perms2 = {
{3,4,1,2,5},
{5,6,3,4,2} };
// Lambda to find the index of a given number in the first permutation
auto findPos = [&perms2](const MyType& m) {return std::distance(perms2[0].begin(), std::find(perms2[0].begin(), perms2[0].end(), m)); };
// Here we will store our resulting set of cycles
Cycles resultingCycles{};
// Go through all single elements of the first permutation
for (size_t currentColumn = 0U; currentColumn < perms2[0].size(); ++currentColumn) {
// This is a temporary for a cycle that we found in this loop
Cycle trialCycle{};
// First value to start with
size_t startColumn = currentColumn;
// Follow the complete path through the 2 permutations
for (bool insertResult{ true }; insertResult; ) {
// Insert found element from the first permutation in the current cycle
const auto& [newElement, insertOk] = trialCycle.insert(perms2[0][startColumn]);
// Find the index of the element under the first value (from the 2nd permutation)
startColumn = findPos(perms2[1][startColumn]);
// Check if we should continue (could we insert a further element into our current cycle?)
insertResult = insertOk && startColumn < perms2[0].size();
}
// We will only consider cycles with a length > 1
if (trialCycle.size() > 1) {
// Store the current temporary cycle as an additional result.
resultingCycles.insert(trialCycle);
}
}
// Simple output
std::cout << "\n\nFound Cycles:\n\n";
// Go through all found cycles
for (const Cycle& c : resultingCycles) {
// Print an opening brace
std::cout << "(";
// Handle the comma delimiter
std::string delimiter{};
// Print all integer values of the cycle
for (const MyType& m : c) {
std::cout << delimiter << m;
delimiter = ",";
}
std::cout << ")";
}
std::cout << "\n\n";
return 0;
}

How can I insert multiple numbers to a particular element of a vector?

I am quite new to C++ and vector. I am calculating two things say 'i' and 'x' and I want to add 'x' that belongs to a particular vector element 'i'. I learned that if I have one 'x' value, I can simply do that by 'vec.at(i) = x'. But what if I want to add several 'x' values to a particular 'i' index of a vector?
Let's try to make it clear: Let's say I am searching for number '5' and '3' over a list of numbers from 1 to 10 (5 and 3 can occur multiple times in the list) and each time I am looking for number 5 or 3 that belong to index '2' of 'vec' I can do 'vec.at(2) = 5' or 'vec.at(2) = 3'. Then what if I have two '5' values and two '3' values so the sum of the index '2' of 'vec' will be '5+5+3+3' = 16?
P.S: using a counter and multiply concept will not solve my problem as the real problem is quite complicated. This query is just an example only. I want a solution within vector concept. I appreciate your help in advance.

If you know how many indices you want ahead of time, then try std::vector<std::vector<int>> (or instead of int use double or whatever).
For instance, if you want a collection of numbers corresponding to each number from 0 to 9, try
//This creates the vector of vectors,
//of length 10 (i.e. indices [0,9])
//with an empty vector for each element.
std::vector<std::vector<int>> vec(10, std::vector<int>());
To insert an element at a given index (assuming that there is something there, so in the above case there is only 'something there' for elements 0 through 9), try
vec.at(1).push_back(5);
vec.at(1).push_back(3);
And then to take the sum of the numbers in the vector at index 1:
int sum = 0;
for (int elem : vec.at(1)) { sum += elem; }
//sum should now be 8
If you want it to work for arbitrary indices, then it should be
std::map<int, std::vector<int>> map;
map[1].push_back(5); //creates an empty vector at index 1, then inserts
map[1].push_back(3); //uses the existing vector at index 1
int sum = 0;
for (int elem : map.at(1)) { sum += elem; }
Note that for std::vector and std::map, [] does very different things. Most of the time you want at, which behaves about the same for both, but in this very specific case, [] for std::map is a good choice.
EDIT: To sum over every element in every vector in the map, you need an outer loop to go through the vectors in the map (paired with their index) and an inner loop like the one above. For example:
int sum = 0;
// The map's element type is std::pair<const int, std::vector<int>>; auto avoids copying each pair.
for (const auto& index_vec : map) {
for (int elem : index_vec.second) { sum += elem; }
}

Array of structures in C++:assignment of values to each structure in array

I have declared a structure st below with a struct variable arr[], an array of structs.
I'm trying to assign the value 1 to 'num' and the values 1 to 10 to 'val' for the first 10 locations of arr[], and the value 2 to 'num' and the values 1 to 10 to 'val' for the next 10 locations. But when I traced the code, it did not assign the values to the num and val of the same array location. If I wanted to assign num=1 and val=4 to the 4th structure, it would assign num=1 to val of the 3rd structure and val=4 to num of the 4th structure.
My query is not about array indices.
The problem is:
If I write the statements
arr[2].num=1;
arr[2].val=2;
({num,val})
The expected result is: arr[2]={1,2}
But the actual result is:
arr[1]={num,1}
arr[2]={2,val}
#include <iostream>
class abc
{
public:
struct st
{
int num;
int val;
};
st arr[21];
void funct();
};
void abc::funct()
{
int i,j,k=1;
for(i=1;i<=2;i++)
{
for(j=1;j<=10;j++)
{
arr[k].num=i;
arr[k].val=j;
k++;
}
}
}
int main()
{
abc z;
z.funct();
return 0;
}

1) Arrays are 0-based, i.e. the index starts at 0 and goes up to arraySize - 1 (from your declaration).
2) Walk through your code and look at what each line is doing...
3) Now think about what you need to do:
3.1) iterate over each element of the array
3.2) for each element access the structure
3.3) inside the structure you wanted to set
num = 1 to the first 10 elements and 2 to second 10 elements (do you see any simple mathematical rule here?)
val = arrayElementIndex (this is too simple)
Look at your code and think about how it needs to be done.

http://www.cplusplus.com/doc/tutorial/arrays/
If you want to access the first element in the array named arr then you do it by arr[0].
So it will probably help if you do this:
int i,j,k=0;

How to sort both data member arrays of a class in code shown

I have written a simple C++ program to understand how to use the sort function on a user-defined class type to sort the data members of a class.
But this only sorts the values of variable b in the class, because the earlier sorted order of variable a is disturbed while sorting b.
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
class Entry
{
public:
int a, b;
};
// std::sort needs a strict "less than": the comparator must return false for equal keys.
bool compare1(const Entry& e1, const Entry& e2)
{
return e1.a < e2.a;
}
bool compare2(const Entry& e1, const Entry& e2)
{
return e1.b < e2.b;
}
int main()
{
int i;
vector<Entry> array(4);
array[0]. a =5 , array[0]. b =8 ;
array[1]. a =10 , array[1]. b =4 ;
array[2]. a =3 , array[2]. b =2 ;
array[3]. a =1 , array[3]. b =12 ;
sort(array.begin(), array.end(), compare1);
sort(array.begin(), array.end(), compare2);
cout << "sorted:" << endl;
for (i = 0; i< 4; i++)
cout << array[i]. a << " " << array[i].b << endl;
}
The output I get is as follows:
sorted:
3 2
10 4
5 8
1 12
How to sort both data member arrays - a,b?

It depends on how you want your elements to be sorted:
(1) Sort as pairs, keyed on a: (1,12), (3,2), (5,8), (10,4)
(2) Sort as pairs, keyed on b: (3,2), (10,4), ...
(3) Sort as pairs lexicographically: same as sorting on a, since there are no repeated values for a.
In case (1) you use compare1, in case (2) you use compare2. (For case (3) you would have to write another predicate, or just use std::pair<int,int>.)
Case 4: If you want the values of a and b sorted separately and destroy the pairing, then you need to put the values into separate vectors of ints and sort those individually:
std::vector<int> avals(array.size()), bvals(array.size());
for (size_t i = 0; i != array.size(); ++i)
{
avals[i] = array[i].a;
bvals[i] = array[i].b;
}
std::sort(avals.begin(), avals.end());
std::sort(bvals.begin(), bvals.end());
There is no way around this. A container of Entry objects can only move elements around as a whole.

I think you are confusing what sort does. It does not sort the members that you use in the comparison function, but rather the whole objects. In your case that means that because you initialized one object with the value pair (5,8), there will always be an element in the vector that is (5,8).
Sorting the array by the first member means that it moves to the second-to-last position (5 being the second-largest first element), and sorting by the second member will move the object to, well, in this case also the second-to-last position. But that only moves the object within the container; it will always be (5,8).

If you want both to impact the comparison, include both in the check (i.e. compare) function. Else, it's either one or the other. If you need to have different "views", you need a smarter container (such as boost::multi_index)
