Finding duplicated elements in 2 arrays without using nested loop

I want a program that finds the elements two arrays have in common, without using two nested loops.
I tried two nested for loops, but that takes too much time.
Here is what I have done:
for(j = 0; j < n; j++){
for(i = 0; i < m; i++){
if(arr1[i] == arr2[j]){
// function
} else if(arr1[i] != arr2[j]) {
// another function
}
}
}

Build a hash set from the elements of array1, then iterate over array2 and look each element up in the set to find the duplicates.

This solution shows three methods and measures the time each one needs:
1. Your approach, using a nested loop
2. Using std::set_intersection
3. Using std::unordered_set
There are of course more possible solutions.
Please see:
#include <iostream>
#include <iterator>
#include <random>
#include <chrono>
#include <algorithm>
#include <unordered_set>
constexpr size_t ArraySize1 = 100000u;
constexpr size_t ArraySize2 = 150000u;
int main() {
static int arr1[ArraySize1], arr2[ArraySize2]; // static: about 1 MB in total, too large for some default stacks
// ---------------------------------------------------------------
// Create some random numbers and fill both arrays with it
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_int_distribution<> distrib(1, 2000000000);
for (int& i : arr1) i = distrib(gen);
for (int& i : arr2) i = distrib(gen);
// ---------------------------------------------------------------
// Test algorithms
// 1. Nested loops
auto start = std::chrono::system_clock::now();
// ---
for (size_t k = 0; k < ArraySize1; ++k)
for (size_t i = 0; i < ArraySize2; ++i)
if (arr1[k] == arr2[i])
std::cout << arr1[k] << '\n';
// ---
auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start);
std::cout << "Time with nested loops: " << elapsed.count() << " ms\n\n";
// 2. Set intersection
start = std::chrono::system_clock::now();
// ---
std::sort(std::begin(arr1), std::end(arr1));
std::sort(std::begin(arr2), std::end(arr2));
std::set_intersection(std::begin(arr1), std::end(arr1), std::begin(arr2), std::end(arr2), std::ostream_iterator<int>(std::cout, "\n"));
// ---
elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start);
std::cout << "Time with set_intersection: " << elapsed.count() << " ms\n\n";
// 3. std::unordered_set
start = std::chrono::system_clock::now();
std::unordered_set<int> setArray1(std::begin(arr1),std::end(arr1));
for (const int i : arr2) {
if (setArray1.count(i)) {
std::cout << i << '\n';
}
}
elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::system_clock::now() - start);
std::cout << "Time with unordered set: " << elapsed.count() << " ms\n\n";
}

Use a bool visited array for array1, then check array2 against it (this only works if the range of element values is limited)
Use a map or set from the C++ STL
Use a trie data structure (advanced technique)

Related

C++ : Program fails during vector filling and searching

I'm practicing C++ vector and as an exercise I want to fill a vector with 16 million random numbers and then find the position of the first occurrence of a number. The code which I implemented so far is this:
int getIndexOf(std::vector<int>& v, int num) {
for(std::size_t i=0; i < v.size(); i++) {
if(v.at(i) == num) {
return i;
}
}
return -1;
}
int main() {
int searchedNumber = 42;
int vectorSize = 16000000;
std::vector<int> v(vectorSize);
for(std::size_t i=0; i < v.size(); i++) {
v.push_back(rand() % 10000000);
}
//Linear search
auto start = std::chrono::high_resolution_clock::now();
int position = getIndexOf(v, searchedNumber);
auto stop = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::seconds>(stop - start);
std::cout << "The linear search took: " << duration.count() << " seconds" << std::endl;
std::cout << "The number " << searchedNumber << " occur first at position " << position << std::endl;
return 0;
}
Additionally, I measure the time just for some statistics. The problem is that the program crashes with a bad_alloc error, which I associate with running out of stack space. So initially I thought that filling a vector with so many numbers while the vector is on the stack was the reason for the crash, and I created the vector dynamically (through a pointer). However, I still get the same error. What might be the reason for this?

int vectorSize = 16000000;
std::vector<int> v(vectorSize);
for(std::size_t i=0; i < v.size(); i++) {
v.push_back(rand() % 10000000);
}
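This part is the problem. push_back() appends an element to the vector, so it increases size(). The loop condition i < v.size() therefore never becomes false, and the vector keeps growing until an allocation fails with bad_alloc. Note also that std::vector<int> v(vectorSize) already creates 16 million zero-initialized elements before the loop starts.
Do this instead: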
int vectorSize = 16000000;
std::vector<int> v;
v.reserve(vectorSize); // allocate memory without actually adding elements
for(int i=0; i < vectorSize; i++) { // use the known size
v.push_back(rand() % 10000000);
}

Why C++ array class is taking more time to operate on, than the C-style array?

I have written a simple code to compare the time taken to operate on the elements of two arrays (both of same size), one defined by C++ array class and the other by plain C-style array. The code I have used is
#include <iostream>
#include <array>
#include <chrono>
using namespace std;
const int size = 1E8;
const int limit = 1E2;
array<float, size> A;
float B[size];
int main () {
using namespace std::chrono;
//-------------------------------------------------------------------------------//
auto start = steady_clock::now();
for (int i = 0; i < limit; i++)
for (int j = 0; j < size; j++)
A.at(j) *= 1.;
auto end = steady_clock::now();
auto span = duration_cast<seconds> (end - start).count();
cout << "Time taken for array A is: " << span << " sec" << endl;
//-------------------------------------------------------------------------------//
start = steady_clock::now();
for (int i = 0; i < limit; i++)
for (int j = 0; j < size; j++)
B[j] *= 1.;
end = steady_clock::now();
span = duration_cast<seconds> (end - start).count();
cout << "Time taken for array B is: " << span << " sec" << endl;
//-------------------------------------------------------------------------------//
return 0;
}
which I have compiled and run with
g++ array.cxx
./a.out
The output I get is the following
Time taken for array A is: 52 sec
Time taken for array B is: 22 sec
Why does the C++ array class take much longer to operate on?

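The std::array::at member function does bounds checking, so there is some extra overhead on every access. For a fairer comparison, use std::array::operator[], which does no checking, just like indexing the plain array. The g++ array.cxx command also builds without optimization; enabling it (for example with -O2) will likely narrow the gap further.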

How do I Reverse Display a Linked List?

I have a function that inserts random integers into a list, and a function that displays the list. With what I have now, is there a way to display that list in reverse?
void InsertRandomInts()
{
LinkedSortedList<int> list;
srand((unsigned)time(NULL));
for (int i = 0; i < 50; ++i)
{
int b = rand() % 100 + 1;
list.insertSorted(b);
}
displayListForward(&list);
}
void displayListForward(SortedListInterface<int>* listPtr)
{
cout << "The sorted list contains " << endl;
for (int pos = 1; pos <= listPtr->getLength(); pos++)
{
cout << listPtr->getEntry(pos) << " ";
}
cout << endl << endl;
}

Iterate over the list from rbegin() to rend() and print each element. That prints it in reverse.
Either 1) stop reinventing the wheel and use a standard container that has these functions, or 2) implement rbegin() and rend() for your custom container.
For example:
for (auto it = list.rbegin(); it != list.rend(); ++it)
// Print *it

A good idea would be to get rid of that non-standard generic container and instead use std::list (or really just std::vector if you don't need list-specific semantics such as being able to remove an element without invalidating iterators to other elements).
The sort member function can be applied after all items have been added. You can then finally use rbegin and rend for reverse iteration.
Here is a simple example:
#include <iostream>
#include <list>
#include <cstdlib>
#include <ctime>
// Prints the list from the last element to the first.
void DisplayListReverse(const std::list<int>& list)
{
std::cout << "The sorted list, in reverse, contains\n";
for (auto iter = list.rbegin(); iter != list.rend(); ++iter)
{
std::cout << *iter << " ";
}
std::cout << '\n';
}
void InsertRandomInts()
{
std::list<int> list;
std::srand(static_cast<unsigned>(std::time(nullptr)));
for (int i = 0; i < 50; ++i)
{
auto const b = std::rand() % 100 + 1;
list.push_back(b);
}
list.sort();
DisplayListReverse(list);
}
int main()
{
InsertRandomInts();
}
But this may be overkill; for a quick solution, just reverse your current loop:
for (int pos = listPtr->getLength(); pos >= 1; pos--)

Why does the combination of find + insert work faster than the single insert statements

Why does the combination of find + insert work faster than the single insert statements?
#include <chrono>
#include <iostream>
#include <unordered_set>
int main()
{
{
auto t1 = std::chrono::high_resolution_clock::now();
auto elements = 100000000;
std::unordered_set<int> s;
s.reserve(elements);
for (int i = 0; i < elements; ++i)
{
auto it = s.find(i % 2);
if (it == s.end())
{
s.insert(i % 2);
}
}
auto t2 = std::chrono::high_resolution_clock::now();
std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count() << std::endl;
}
{
auto t1 = std::chrono::high_resolution_clock::now();
auto elements = 100000000;
std::unordered_set<int> s;
s.reserve(elements);
for (int i = 0; i < elements; ++i)
{
s.insert(i % 2);
}
auto t2 = std::chrono::high_resolution_clock::now();
std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count() << std::endl;
}
}
This code gives me the following results with MSVC 14.0 (Release configuration, of course):
716
1005

Since most of the elements you're adding are already in the set, insert has more work to do than find: it needs to construct a pair holding an iterator to the existing element and a boolean indicating whether the insertion took place. find only has to return the iterator. You can look at the library code to see this.
A more accurate title and question almost gives you the answer. Since you're only inserting two elements, then checking for around 100,000,000 more, a better title would end with "when the elements are already in the set". A better question is "why does find work faster than insert?".

swap array values in c++

I want to shift the array values left: if my value v=4 is in a[n], remove 4 from the array and add 0 at the last index. How can I do this?
#include <iostream>
using namespace std;
const int n=5;
int main()
{
int a[n]={1,5,4,6,8}, v=4;
int b[n];
cout << "Enter a Value" << endl;
cout<<v<<endl;
for(int i=0; i<n; i++){
cout<<a[i];
}
cout<<endl;
for(int j=0; j<n; j++){
b[j]=a[j];
if(a[j]==v)
b[j]=a[++j];
cout<<b[j];
}
return 0;
}

#include <vector> // needed for vector
#include <algorithm> // needed for find
#include <iostream> // needed for cout, cin
using namespace std;
int main()
{
// Vectors are just like dynamic arrays, you can resize vectors on the fly
vector<int> a { 1,5,4,6,8 }; // Prepare required vector
int v;
cout << "enter value"; // Read from user
cin >> v;
auto itr = find( a.begin(), a.end(), v); // Search entire vector for 'v'
if( itr != a.end() ) // If value entered by user is found in vector
{
a.erase(itr); // Delete the element and shift everything after element
// Toward beginning of vector. This reduces vector size by 1
a.push_back(0); // Add 0 in the end. This increases vector size by 1
}
for( int i : a ) // Iterate through all element of a (i holds element)
cout << i; // Print i
cout << '\n'; // Line end
return 0;
}
A few standard library features worth looking up: std::vector, std::find, iterators, vector::erase, vector::push_back.

You could use std::rotate. I suggest that you use std::vector instead of C arrays and take full advantage of the STL algorithms. Nevertheless, below I illustrate two versions, one with a C array and one with std::vector:
Version with C array:
#include <iostream>
#include <algorithm>
int main()
{
int const n = 5;
int a[n] = {1,5,4,6,8};
std::cout << "Enter a Value" << std::endl;
int v;
std::cin >> v;
for(auto i : a) std::cout << i<< " ";
std::cout << std::endl;
auto it = std::find(std::begin(a), std::end(a), v);
if(it != std::end(a)) {
std::rotate(it, it + 1, std::end(a)); // rotate left by one: the found value moves to the end
a[n - 1] = 0;
}
for(auto i : a) std::cout << i<< " ";
std::cout << std::endl;
return 0;
}
Version with vector:
#include <iostream>
#include <vector>
#include <algorithm>
int main()
{
std::vector<int> a{1,5,4,6,8};
std::cout << "Enter a Value" << std::endl;
int v;
std::cin >> v;
for(auto i : a) std::cout << i<< " ";
std::cout << std::endl;
auto it = std::find(std::begin(a), std::end(a), v);
if(it != std::end(a)) {
std::rotate(it, it + 1, std::end(a)); // rotate left by one: the found value moves to the end
a.back() = 0;
}
for(auto i : a) std::cout << i<< " ";
std::cout << std::endl;
return 0;
}

Here's an example using std::array
#include <array>
#include <algorithm>
#include <iostream>
int main()
{
// defines our array.
std::array<int, 5> a = {{ 1, 2, 3, 4, 5 }};
// find the position of the element with the value 4.
auto where = std::find(a.begin(), a.end(), 4);
// if it wasn't found, give up
if (where == a.end())
return 0;
// move every element past "where" down one.
std::move(where + 1, a.end(), where);
// fill the very last element of the array with zero
a[ a.size() - 1] = 0;
// loop over our array, printing it to stdout
for (int i : a)
std::cout << i << " ";
std::cout << "\n";
}
Why would anyone use these awkward algorithms? Well, there are a few reasons. Firstly, they are container-independent. This will work with arrays and vectors and deques, no problem. Secondly, they can easily be used to work on a whole range of elements at once, not just single items, and can copy between containers and so on. They're also type-independent: you can have an array of strings, or a vector of ints, or other more complex things, and the algorithms will still work just fine.
They're quite powerful, once you've got over their initial user-unfriendliness.
You can always use either std::array or std::vector or whatever without using the standard library algorithms, of course.
std::array<int, 5> a = {{ 1, 2, 3, 4, 5 }};
size_t where = 0;
int to_remove = 4;
// scan through until we find our value.
while (where < a.size() && a[where] != to_remove) // check the bound before reading a[where]
where++;
// if we didn't find it, give up
if (where == a.size())
return 0;
// shuffle down the values
for (size_t i = where; i < a.size() - 1; i++)
a[i] = a[i + 1];
// set the last element to zero
a[ a.size() - 1] = 0;
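As a final example, you can use memmove (as suggested by BLUEPIXY) to do the shuffling-down operation in one function call. Its last argument is a size in bytes, so the number of elements to move has to be multiplied by sizeof(int):
#include <cstring>
if (where < a.size() - 1)
std::memmove(&a[where], &a[where + 1], (a.size() - where - 1) * sizeof(int));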
