Keep track of highest 5 numbers during file input

So let's say I have a struct:
struct largestOwners
{
string name;
double amountOwned;
};
And I am reading it in from a file using ifstream, with 300 names and amounts.
How can I keep track of the highest 5 numbers during input, so that I don't have to sort afterwards?
My goal is to track the 5 highest amounts while reading so I can easily print them out later, and save the time and processing of doing it as a separate step.
I get that I can store this in an array or another struct, but what's a good algorithm for tracking this during ifstream input to the struct?
Let's say the text file looks like this when I'm reading it in:
4025025 Tony
66636 John
25 Tom
23693296 Brady
363 Bradley
6200 Tim
Thanks!

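To keep track of the highest 5 numbers in a stream of incoming numbers, you could use a min-heap of size 5 (in C++, std::priority_queue with std::greater works, and a std::set can also play this role, or a std::multiset if duplicate values must be kept).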
First fill the min-heap with the first 5 numbers. After that, for each incoming element, compare that with the smallest of the largest 5 numbers that you have (root of the min-heap). If the current number is smaller than that, do nothing, otherwise remove the 5th largest (pop from min-heap) and insert the current number to the min-heap.
Deleting and inserting in the min-heap each take O(log k) time, where k is the heap size (5 here), so processing n numbers takes O(n log k) overall.
For example, consider the following stream of numbers:
1 2 5 6 3 4 0 10 3
The min-heap will have 1 2 3 5 6 initially.
On encountering 4, 1 gets removed and 4 gets inserted.
Min heap now looks like this: 2 3 4 5 6
On encountering 0, nothing happens.
On encountering 10, 2 gets removed and 10 gets inserted.
Min heap now looks like this: 3 4 5 6 10
On encountering 3, nothing happens.
So your final set of 5 largest elements are contained in the heap (3 4 5 6 10)
You can even tweak this to keep track of the k highest elements in an incoming stream of numbers. Just change the size of the min-heap to k.
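The min-heap idea above could be sketched for the struct from the question roughly as follows. This is only a sketch: the class name TopK and its members offer and sortedDescending are made up for illustration, and it assumes ties on amountOwned may be resolved arbitrarily.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

struct largestOwners {
    std::string name;
    double amountOwned;
};

// Comparator that puts the SMALLEST amount on top of the priority_queue (min-heap).
struct ByAmountGreater {
    bool operator()(const largestOwners& a, const largestOwners& b) const {
        return a.amountOwned > b.amountOwned;
    }
};

// Hypothetical helper: remembers the k largest records offered so far.
class TopK {
public:
    explicit TopK(std::size_t k) : k_(k) { assert(k > 0); }

    void offer(const largestOwners& rec) {
        if (heap_.size() < k_) {
            heap_.push(rec);                      // still filling the heap
        } else if (rec.amountOwned > heap_.top().amountOwned) {
            heap_.pop();                          // drop the current k-th largest
            heap_.push(rec);
        }
    }

    // Copies the heap and returns its contents, largest amount first.
    std::vector<largestOwners> sortedDescending() const {
        Heap copy = heap_;
        std::vector<largestOwners> out(copy.size());
        for (std::size_t i = out.size(); i-- > 0;) {
            out[i] = copy.top();                  // smallest goes to the back
            copy.pop();
        }
        return out;
    }

private:
    typedef std::priority_queue<largestOwners, std::vector<largestOwners>, ByAmountGreater> Heap;
    std::size_t k_;
    Heap heap_;
};
```

In the reading loop this would be used as something like: TopK top(5); largestOwners rec; while (file >> rec.amountOwned >> rec.name) top.offer(rec);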

While reading the file, keep a sorted list of the 5 largest numbers seen (and their owners).
Whenever you read a value higher than the lowest of the 5, remove the lowest and insert the new number in your sorted list.
This list can be stored in an array or in any other ordered data structure for which you can implement (or already have) sorting and insertion.
Instead of sorting the list you can also simply go through the 5 entries every time you read a new one (should not be too bad, because 5 entries is a very small number)
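A minimal sketch of the sorted-list variant, using plain amounts for brevity. The function name offerSorted is hypothetical, and the list is assumed to be kept in descending order by the caller only ever modifying it through this function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Keeps `best` sorted in descending order and at most k entries long.
void offerSorted(std::vector<double>& best, double value, std::size_t k) {
    assert(k > 0);
    if (best.size() == k && value <= best.back()) {
        return;  // not larger than the lowest of the current top k
    }
    // First position whose element is smaller than value (list is descending).
    std::vector<double>::iterator pos =
        std::upper_bound(best.begin(), best.end(), value, std::greater<double>());
    best.insert(pos, value);
    if (best.size() > k) {
        best.pop_back();  // evict the lowest entry
    }
}
```

With only 5 entries, the insertion shifts at most a handful of elements, so the cost per input value is effectively constant.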

You can use the standard library function std::nth_element() for this.
It should be fairly easy to implement a comparison function (or overload the comparison operator) for your struct. Then you'd just parse the file into a vector of those and be done with it. The algorithm uses a partial sort, linear time on average.
Here's the example given on the documentation site I've linked below:
#include <iostream>
#include <vector>
#include <algorithm>
#include <functional>
int main()
{
std::vector<int> v{5, 6, 4, 3, 2, 6, 7, 9, 3};
std::nth_element(v.begin(), v.begin() + v.size()/2, v.end());
std::cout << "The median is " << v[v.size()/2] << '\n';
std::nth_element(v.begin(), v.begin()+1, v.end(), std::greater<int>());
std::cout << "The second largest element is " << v[1] << '\n';
}
For reference:
http://en.cppreference.com/w/cpp/algorithm/nth_element
Out of curiosity, I have implemented some approaches:
#include <algorithm>
#include <functional>
#include <deque>
#include <queue>
#include <set>
#include <vector>
std::vector<int> filter_nth_element(std::vector<int> v, int n) {
auto target = v.begin()+n;
std::nth_element(v.begin(), target, v.end(), std::greater<int>());
std::vector<int> result(v.begin(), target);
return result;
}
std::vector<int> filter_pqueue(std::vector<int> v, int n) {
std::vector<int> result;
std::priority_queue<int, std::vector<int>, std::greater<int>> q;
for (auto i: v) {
q.push(i);
if (q.size() > n) {
q.pop();
}
}
while (!q.empty()) {
result.push_back(q.top());
q.pop();
}
return result;
}
std::vector<int> filter_set(std::vector<int> v, int n) {
std::set<int> s;
for (auto i: v) {
s.insert(i);
if (s.size() > n) {
s.erase(s.begin());
}
}
return std::vector<int>(s.begin(), s.end());
}
std::vector<int> filter_deque(std::vector<int> v, int n) {
std::deque<int> q;
for (auto i: v) {
q.push_back(i);
if (q.size() > n) {
q.erase(std::min_element(q.begin(), q.end()));
}
}
return std::vector<int>(q.begin(), q.end());
}
std::vector<int> filter_vector(std::vector<int> v, int n) {
std::vector<int> q;
for (auto i: v) {
q.push_back(i);
if (q.size() > n) {
q.erase(std::min_element(q.begin(), q.end()));
}
}
return q;
}
And I have made up some tests:
#include <random>
#include <iostream>
#include <chrono>
std::vector<int> filter_nth_element(std::vector<int> v, int n);
std::vector<int> filter_pqueue(std::vector<int> v, int n);
std::vector<int> filter_set(std::vector<int> v, int n);
std::vector<int> filter_deque(std::vector<int> v, int n);
std::vector<int> filter_vector(std::vector<int> v, int n);
struct stopclock {
typedef std::chrono::high_resolution_clock high_resolution_clock;
std::chrono::time_point<high_resolution_clock> start, end;
stopclock() : start(high_resolution_clock::now()) {}
~stopclock() {
using namespace std::chrono;
auto elapsed = high_resolution_clock::now() - start;
auto elapsed_ms = duration_cast<milliseconds>(elapsed);
std::cout << elapsed_ms.count() << " ";
}
};
int main() {
// randomly initialize input array
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_int_distribution<> dist;
std::vector<int> v(10000000);
for (auto &i: v)
i = dist(gen);
// run tests
for (std::vector<int>::size_type x = 5; x <= 100; x+=5) {
// keep this many values
std::cout << x << " ";
{
stopclock t;
auto result = filter_nth_element(v, x);
}
{
stopclock t;
auto result = filter_pqueue(v, x);
}
{
stopclock t;
auto result = filter_set(v, x);
}
{
stopclock t;
auto result = filter_deque(v, x);
}
{
stopclock t;
auto result = filter_vector(v, x);
}
std::cout << "\n";
}
}
And I found it quite interesting to see the relative performance of these approaches (compiled with -O3; I think I have to think a bit more about these results). [Results chart not preserved.]

A binary search tree could be a suitable data structure for this problem. You may find a suitable tree class in the STL or Boost (std::set and std::map are typically implemented as balanced trees). Otherwise, simply use a struct if you insist.
The struct would look like this:
struct tnode {        /* the tree node: */
    char *word;           /* points to the text */
    int count;            /* number of occurrences */
    struct tnode *left;   /* left child */
    struct tnode *right;  /* right child */
};
Taken from The C Programming Language, chapter 6.5, Self-referential Structures. Just adapt it to your needs.
That said, if you want to program in C++ properly, create a tree data structure (class) or use an existing one.
Considering that you only have 300 entries, that should do it.
In theory, a plain (unbalanced) tree is efficient when the input data is random, but with so few entries that hardly matters in your case. I think it is a good solution.

You can use a sorted buffer of 5 elements. On each step, if the item is higher than the lowest item in the buffer, put the item in the buffer and evict the lowest.

Use a map of elements, keyed by the number (a std::multimap, so that two records with the same number do not overwrite each other).
First create a class:
class Data {
public:
    std::string name;
    int number;
};
typedef std::multimap< int, Data > DataItems;
DataItems largest;
// For each record dt (a Data) read from the file:
if( largest.size() < 5 ) {
    // Fewer than five elements read so far: always keep the new one.
    largest.insert( std::make_pair( dt.number, dt ) );
} else {
    // Otherwise, if it is larger than the smallest of the largest five,
    // then the largest five has changed.
    DataItems::iterator it = largest.begin(); // lowest current item
    if( it->first < dt.number ) { // is this item bigger? - yes
        largest.insert( std::make_pair( dt.number, dt ) ); // add it (largest.size() == 6)
        largest.erase( largest.begin() ); // remove smallest item
    }
}

You can use a set to keep track of the highest values. If you want to track non-unique numbers use a multiset instead:
vector<int> nums{10,11,12,1,2,3,4,5,6,7,8,9}; //example data
int k=5; // number of values to track
set<int> s; // this set will hold the result
for(auto a: nums)
{
if(s.size()<k)s.insert(a);
else if(a>*s.begin())
{
s.erase(s.begin());
s.insert(a);
}
}
Of course you will have to provide a custom comparison function for your struct.
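As a rough illustration of that custom comparison, here is a sketch using std::multiset with the struct from the question. The names ByAmount, OwnerSet and offer are assumptions made for this example, not part of the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

struct largestOwners {
    std::string name;
    double amountOwned;
};

// Orders records by amount only, so begin() is always the smallest amount.
struct ByAmount {
    bool operator()(const largestOwners& a, const largestOwners& b) const {
        return a.amountOwned < b.amountOwned;
    }
};

// multiset rather than set, so two owners with equal amounts are both kept.
typedef std::multiset<largestOwners, ByAmount> OwnerSet;

// Same logic as the int version above, applied to whole records.
void offer(OwnerSet& s, const largestOwners& rec, std::size_t k) {
    assert(k > 0);
    if (s.size() < k) {
        s.insert(rec);
    } else if (rec.amountOwned > s.begin()->amountOwned) {
        s.erase(s.begin());
        s.insert(rec);
    }
}
```

Iterating the set from rbegin() to rend() then yields the tracked records from the highest amount down.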

I'm surprised nobody has mentioned priority queue data-structure that's made exactly for this
https://en.cppreference.com/w/cpp/container/priority_queue

Related

Erasing the first entry of a vector, after the maximum is reached

I have a vector in which I save coordinates.
I perform a series of calculations on each coordinate, which is why I have a limit on the vector size.
Right now I clear the vector when the limit is reached.
I'm searching for a method that lets me keep the previous values and only erases the very first value in the vector.
Simplified, something like this (if the maximum size of the vector were 4):
vector<int> vec;
vec = {1,2,3,4}
vec.push_back(5);
vec = {2,3,4,5}
Is this possible?

As suggested by @paddy, you can use std::deque. It is the most performant way to keep N elements if you .push_back(...) a new (last) element and .pop_front() the first element.
std::deque gives O(1) complexity for these operations, unlike std::vector, where erasing the first element is O(N).
#include <deque>
#include <iostream>
int main() {
std::deque<int> d = {1, 2, 3, 4};
for (size_t i = 5; i <= 9; ++i) {
d.push_back(i);
d.pop_front();
// Print
for (auto x: d)
std::cout << x << " ";
std::cout << std::endl;
}
}
Output:
2 3 4 5
3 4 5 6
4 5 6 7
5 6 7 8
6 7 8 9

I think you should properly encapsulate this behaviour in your own vector class, as a std::vector wrapper. You could pass the max capacity as an argument to your constructor, reimplement the methods that may cause "overflow", and just reuse the std::vector ones for the others.
To simplify what you intend to achieve for the push_back case, using a function and a global variable, you could:
check against a max capacity and,
if that capacity is already reached, rotate your vector contents left by one position; then simply overwrite the last element;
otherwise do a normal push_back.
#include <algorithm> // rotate
#include <iostream> // cout
#include <vector>
const size_t max_capacity{4};
void push_back(std::vector<int>& v, int n)
{
if (v.size() == max_capacity)
{
// Rotate 1 left
std::rotate(std::begin(v), std::begin(v) + 1, std::end(v));
v[v.size() - 1] = n;
}
else
{
v.push_back(n);
}
}
int main()
{
std::vector<int> v{};
for (auto i{1}; i < 9; i++)
{
push_back(v, i);
for (auto&& n : v) { std::cout << n << " "; }
std::cout << "\n";
}
}
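The wrapper class suggested at the start of this answer might look something like the sketch below. The class name BoundedBuffer and its interface are hypothetical, and it stores the values in a std::deque (as the other answer recommends) instead of a std::vector, so dropping the oldest value is O(1).

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical wrapper: holds at most max_size values and drops the oldest
// one whenever a new value is pushed into a full buffer.
class BoundedBuffer {
public:
    explicit BoundedBuffer(std::size_t max_size) : max_size_(max_size) {
        assert(max_size > 0);
    }

    void push_back(int value) {
        if (data_.size() == max_size_) {
            data_.pop_front();  // discard the very first (oldest) value
        }
        data_.push_back(value);
    }

    std::size_t size() const { return data_.size(); }
    int operator[](std::size_t i) const { return data_[i]; }

private:
    std::size_t max_size_;
    std::deque<int> data_;
};
```

Pushing 1 through 5 into a BoundedBuffer of size 4 leaves 2 3 4 5, matching the behaviour asked for in the question.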

Sum of different numbers in an array

I want to have a function that returns the sum of different (non duplicate) values from an array: if I have {3, 3, 1, 5}, I want to have sum of 3 + 1 + 5 = 9.
My attempt was:
int sumdiff(int* t, int size){
int sum=0;
for (int i=0; i<=size;i++){
for(int j=i; j<=size;j++){
if(t[i]!=t[j])
sum=sum+t[i];
}
}
return sum;
}
int main()
{
int t[4]={3, 3, 1, 5};
cout << sumdiff(t, 4);
}
It returns 25 and I think I know why, but I do not know how to improve it. What should I change?

Put all the items in a set, then sum them.
Sets are data structures that hold only one element of each value (i.e., each of their elements is unique; if you try to add the same value more than once, only one instance is kept).
There is also a related question about the most elegant way of doing that for ints.

First of all, your loop should be for (int i=0; i<size;i++). Your actual code is accessing out of the bounds of the array.
Then, if you don't want to use STL containers and algorithms (but you should), you can modify your code as follows:
int sumdiff(int* t, int size){
int sum=0;
for (int i=0; i<size;i++){
// check if the value was previously added
bool should_sum = true;
for(int j=0; should_sum && j<i;j++){
if(t[i]==t[j])
should_sum = false;
}
if(should_sum)
sum=sum+t[i];
}
return sum;
}
int main()
{
int t[4]={3, 3, 1, 5};
cout << sumdiff(t, 4);
}

You could:
Store your array contents into an std::unordered_set first. By doing so, you'd essentially get rid of the duplicates automatically.
Then call std::accumulate to compute the sum
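The two steps above can be sketched as follows; the function name sumDistinct is just an illustrative choice.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <unordered_set>

// Sums each distinct value of t[0..size) exactly once.
int sumDistinct(const int* t, std::size_t size) {
    assert(t != nullptr || size == 0);
    // Step 1: building the set drops the duplicates.
    std::unordered_set<int> distinct(t, t + size);
    // Step 2: add up what is left.
    return std::accumulate(distinct.begin(), distinct.end(), 0);
}
```

For the array {3, 3, 1, 5} from the question this returns 9.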

wasthishelpful's answer is exactly what I was talking about; I saw that post after I posted mine.
So, you're trying to detect duplicate numbers using your inner loop.
However, your code adds to the sum inside the inner loop, once for every non-matching pair, which gives you the wrong result.
Try this:
Do only the checking in the inner loop (use a flag to record whether a duplicate was found).
Do your sum outside of the inner loop (add the value only when no duplicate was found).

Here is another solution using std::accumulate, but it iterates over the original elements in the call to std::accumulate, and builds the set and keeps a running total as each number in the array is encountered:
#include <iostream>
#include <numeric>
#include <set>
int main()
{
int t[4] = { 3, 3, 1, 5 };
std::set<int> mySet;
int mySum = std::accumulate(std::begin(t), std::end(t), 0,
[&](int n, int n2){return n += mySet.insert(n2).second?n2:0;});
std::cout << "The sum is: " << mySum << std::endl;
return 0;
}
The way it works is that std::set::insert() returns a pair that tells you whether the item was inserted. The second member of the pair is a bool that denotes whether the item was inserted into the set. We only add onto the total if the insertion is successful; otherwise we add 0.

Insert array elements into a set and use the std::accumulate function:
#include <iostream>
#include <numeric>
#include <set>
int main()
{
int t[4] = { 3, 3, 1, 5 };
std::set<int> mySet(std::begin(t), std::end(t));
int mySum = std::accumulate(mySet.begin(), mySet.end(), 0);
std::cout << "The sum is: " << mySum << std::endl;
return 0;
}

Counting number of distinct integers in array

To find the number of distinct numbers in an array from the lth to the rth index, I wrote a code block like:
int a[1000000];
//statements to input n number of terms from user in a.. along with l and r
int count=r-l+1; //assuming all numbers to be distinct
for(; l<=r; l++){
for(int i=l+1; i<=r; i++){
if(a[l]==a[i]){
count--;
break;
}
}
}
cout<<count<<'\n';
Explanation
For an array say, a=5 6 1 1 3 2 5 7 1 2 of ten elements. If we wish to check the number of distinct numbers between a[1] and a[8] that is the second and the 9th elements (including both), The logic I tried to implement would first take count=8 (no. of elements to be considered) and then it starts from a[1] that is 6 and checks for any other 6 after it, if it does find, it decreases the count by one and goes on for the next number in the row. So that if there are any more occurrence of 6 after that one, it would not be included twice.
Problem: I tried small test cases and it works. But when I tried bigger data, it did not work, so I wanted to know where my logic would fail.
By bigger data I mean that the code was integrated with other parts of the program and then used, which gave incorrect output.

You can try to use std::set
Basic idea is to add all the elements into your new set, and just output the size of your set.
#include <iostream>
#include <vector>
#include <set>
using namespace std;
int main()
{
int l = 1, r = 6;
int arr[] = {1, 1, 2, 3, 4, 5, 5, 5, 5};
set<int> s(&arr[l], &arr[r + 1]);
cout << s.size() << endl;
return 0;
}

Here is an answer that does not use std::set, although that solution is probably simpler.
#include <algorithm>
#include <vector>
int main()
{
int input[10]{5, 6, 1, 1, 3, 2, 5, 7, 1, 2}; //because you like raw arrays, I guess?
std::vector<int> result(std::cbegin(input), std::cend(input)); //result now contains all of input
std::sort(std::begin(result), std::end(result)); //result now holds 1 1 1 2 2 3 5 5 6 7
result.erase(std::unique(std::begin(result), std::end(result)), std::end(result)); //result now holds 1 2 3 5 6 7
result.size(); //gives the count of distinct integers in the given array
}
--
EDIT: Here, have a short version of the set solution, too.
#include <set>
int main()
{
int input[10]{5, 6, 1, 1, 3, 2, 5, 7, 1, 2}; //because you like raw arrays, I guess?
std::set<int> result(std::cbegin(input), std::cend(input));
result.size();
}

The first question to ask with this type of problem is what the possible range of the values is. If the range of numbers N is "reasonably small", then you can use a boolean array of size N to indicate whether the number corresponding to the index is present. You iterate from l to r, and for each value whose flag is not already set, you set the flag and increment a counter.
count = 0;
for(int i=l; i<=r; i++) {
if (! isthere[arr[i]]) {
count++;
isthere[arr[i]] = TRUE;
}
}
In terms of complexity, this approach is O(n), as is one based on a hash set (std::unordered_set) on average, while a tree-based std::set is O(n log n). The flag array is typically faster still, as there is no hashing or node allocation involved. For small N, for example for numbers between 0 and 255, it is also likely to be less memory intensive. For larger N, for example if any 32-bit integer is allowed, the set-based approach is more suitable.
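A self-contained sketch of the flag-array idea, assuming all values are non-negative and no larger than a known maxValue. The function name and the maxValue parameter are assumptions for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts distinct values in arr[l..r] (both inclusive).
// Assumes every value in that range lies in [0, maxValue].
std::size_t countDistinctInRange(const std::vector<int>& arr,
                                 std::size_t l, std::size_t r, int maxValue) {
    assert(maxValue >= 0);
    assert(l <= r && r < arr.size());
    std::vector<bool> seen(static_cast<std::size_t>(maxValue) + 1, false);
    std::size_t count = 0;
    for (std::size_t i = l; i <= r; ++i) {
        assert(arr[i] >= 0 && arr[i] <= maxValue);
        if (!seen[arr[i]]) {   // first time this value shows up
            seen[arr[i]] = true;
            ++count;
        }
    }
    return count;
}
```

For the array 5 6 1 1 3 2 5 7 1 2 from the question with l = 1 and r = 8, the distinct values are 6, 1, 3, 2, 5 and 7, so the function returns 6.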

You said you didn't mind another solution, so here it is. It uses a set, a structure that stores only unique elements. By the way, on bigger data it will be much faster than the solution with two nested loops.
set<int> a1;
for (int i = l; i <= r; i++)
{
a1.insert(a[i]);
}
cout << a1.size();

Below is a way of counting unique numbers with std::unique. Note that std::unique only collapses consecutive duplicates, so the array must be sorted (as it is in this example) or sorted first with std::sort. It moves the unique elements to the front of the array and returns a pointer to the new logical end; the elements after that point are left with unspecified values, so you can no longer rely on the original contents of the array. The array itself is not resized.
#include <stdio.h>
#include <iostream>
#include <algorithm> // for using unique (library function)
int main(){
int arr[] = {1, 1, 2, 2, 3, 3};
int len = sizeof(arr)/sizeof(*arr); // finding size of arr (array)
int unique_sz = std:: unique(arr, arr + len)-arr; // Counting unique elements in arr (Array).
std:: cout << unique_sz << '\n'; // Printing number of unique elements in this array.
return 0;
}
If you need to keep the original contents (the problem mentioned above), you can copy your array into another array first.
#include <stdio.h>
#include <iostream>
#include <algorithm> // for using copy & unique (library functions)
#include <string.h> // for using memcpy (library function)
int main(){
int arr[] = {1, 1, 2, 2, 3, 3};
int brr[100]; // we will copy arr (Array) to brr (Array)
int len = sizeof(arr)/sizeof(*arr); // finding size of arr (array)
std:: copy(arr, arr+len, brr); // the C++ way of copying (needs #include <algorithm>)
memcpy(brr, arr, len*(sizeof(int))); // C-style alternative (needs #include <string.h>); either call alone is enough
int unique_sz = std:: unique(arr, arr+len)-arr; // Counting unique elements in arr (Array).
std:: cout << unique_sz << '\n'; // Printing number of unique elements in this array.
for(int i=0; i<len; i++){ // Here is your old array, that we store to brr (Array) from arr (Array).
std:: cout << brr[i] << " ";
}
return 0;
}

Personally, I'd just use standard algorithms
#include<algorithm>
#include <iostream>
int main()
{
int arr[] = {1, 1, 2, 3, 4, 5, 5, 5, 5};
int *end = arr + sizeof(arr)/sizeof(*arr);
std::sort(arr, end);
int *p = std::unique(arr, end);
std::cout << (int)(p - arr) << '\n';
}
This obviously relies on being allowed to modify the array (it is sorted in place, and the elements after the returned pointer p are left with unspecified values). But it is easy to create a copy of an array if needed and work on the copy.

TL;DR: Use this:
template<typename InputIt>
std::size_t countUniqueElements(InputIt first, InputIt last) {
using value_t = typename std::iterator_traits<InputIt>::value_type;
return std::unordered_set<value_t>(first, last).size();
}
There are two approaches:
Insert everything into a set, count the set. Because you don't care about the order you can use a std::unordered_set which will be faster than std::set. std::set is implemented as a tree which does a lot of allocations so it can be slow.
Use std::sort. If you want to preserve the original array you'll need to make a copy of it.
Here is a complete example.
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>
#include <unordered_set>
#include <iostream>
template<typename RandomIt>
std::size_t countUniqueElementsSort(RandomIt first, RandomIt last) {
if (first == last)
return 0;
std::sort(first, last);
std::size_t count = 1;
auto val = *first;
while (++first != last) {
if (*first != val) {
++count;
}
val = *first;
}
return count;
}
template<typename InputIt>
std::size_t countUniqueElementsSet(InputIt first, InputIt last) {
using value_t = typename std::iterator_traits<InputIt>::value_type;
return std::unordered_set<value_t>(first, last).size();
}
int main() {
std::vector<int> v = {1, 3, 4, 4, 3, 6};
std::cout << countUniqueElementsSet(v.begin(), v.end()) << "\n";
std::cout << countUniqueElementsSort(v.begin(), v.end()) << "\n";
int v2[] = {1, 3, 4, 4, 3, 6};
std::cout << countUniqueElementsSet(v2, v2 + 6) << "\n";
std::cout << countUniqueElementsSort(v2, v2 + 6) << "\n";
}
Using that loop in the sort version should be faster than std::unique, since it only counts and never moves elements.
The complexity of approach 2 (sorting) is worse than that of approach 1 (the set): O(N log N) versus O(N) in the average case. But sorting avoids allocation, so it may end up being faster for small arrays or ones that are already sorted or mostly sorted.
You should definitely not use std::set, and probably not use std::unique (though it does lead to fewer lines of code and won't make that much difference to performance, so it is up to you).
In any case, you should usually go with the unordered_set version: it's a lot simpler and should be faster in almost all cases.
As other people have mentioned, if you know your input domain is small you can use a bool array instead of an unordered_set.

How to Create All Permutations of Variables from a Variable Number of STL Vectors [duplicate]

This question already has answers here:
Generate all combinations from multiple lists
(11 answers)
Closed 9 years ago.
I have a variable number of std::vector<int>s, let's say I have 3 vectors in this example:
std::vector<int> vect1 {1,2,3,4,5};
std::vector<int> vect2 {1,2,3,4,5};
std::vector<int> vect3 {1,2,3,4,5};
The values of the vectors are not important here. Also, the lengths of these vectors will be variable.
From these vectors, I want to create every permutation of vector values, so:
{1, 1, 1}
{1, 1, 2}
{1, 1, 3}
...
...
...
{3, 5, 5}
{4, 5, 5}
{5, 5, 5}
I will then insert each combination into a key-value pair map for further use with my application.
What is an efficient way to accomplish this? I would normally just use a for loop, and iterate across all parameters to create all combinations, but the number of vectors is variable.
Thank you.
Edit: I will include more specifics.
So, first off, I'm not really dealing with ints, but rather a custom object. ints are just for simplicity. The vectors themselves exist in a map like so std::map<std::string, std::vector<int> >.
My ultimate goal is to have an std::vector< std::map< std::string, int > >, which is essentially a collection of every possible combination of name-value pairs.

Many (perhaps most) problems of the form "I need to generate all permutations of X" can be solved by creative use of simple counting (and this is no exception).
Let's start with the simple example: 3 vectors of 5 elements apiece. For our answer we will view an index into these vectors as a 3-digit, base-5 number. Each digit of that number is an index into one of the vectors.
So, to generate all the combinations, we simply count from 0 up to (but not including) 5^3 = 125. We convert each number into 3 base-5 digits, and use those digits as indices into the vectors to get a permutation. When we reach 125, we've enumerated all the permutations of those vectors.
Assuming the vectors are always of equal length, changing the length and/or number of vectors is just a matter of changing the number of digits and/or number base we use.
If the vectors are of unequal lengths, we simply produce a result in which not all of the digits are in the same base. For example, given three vectors of lengths 7, 4 and 10, we'd still count from 0 up to 7x4x10 = 280. We'd generate the least significant digit as N%10, the next least significant as (N/10)%4, and the most significant as N/40 (which is already below 7, since N < 280).
Presumably that's enough to make it fairly obvious how to extend the concept to an arbitrary number of vectors, each of arbitrary size.
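One way to turn the counting idea into code is sketched below. The function names nthCombination and allCombinations are made up for this illustration, and the last vector is treated as the least significant digit, matching the 7, 4, 10 example above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Decodes counter n into one element per vector, treating n as a
// mixed-radix number whose digit i has base vectors[i].size().
std::vector<int> nthCombination(const std::vector<std::vector<int>>& vectors,
                                std::size_t n) {
    std::vector<int> combo(vectors.size());
    for (std::size_t i = vectors.size(); i-- > 0;) {
        assert(!vectors[i].empty());
        const std::size_t base = vectors[i].size();
        combo[i] = vectors[i][n % base];  // current digit selects the element
        n /= base;                        // move on to the next digit
    }
    return combo;
}

// Counts from 0 to (product of sizes) - 1 and decodes every counter value.
std::vector<std::vector<int>> allCombinations(const std::vector<std::vector<int>>& vectors) {
    std::size_t total = 1;
    for (const auto& v : vectors) {
        total *= v.size();
    }
    std::vector<std::vector<int>> result;
    result.reserve(total);
    for (std::size_t n = 0; n < total; ++n) {
        result.push_back(nthCombination(vectors, n));
    }
    return result;
}
```

For vectors of sizes 2, 3 and 4 this produces 24 combinations. If only one combination is needed at a time, calling nthCombination inside the counting loop avoids storing them all.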

0 -> 0,0,0
1 -> 0,0,1
2 -> 0,1,0
3 -> 0,1,1
4 -> 1,0,0
...
7 -> 1,1,1
8 -> 1,1,2
...
The map should translate a linear integer into a combination (ie: a1,a2,a3...an combination) that allows you to select one element from each vector to get the answer.
There is no need to copy any of the values from the initial vectors. You can use a mathematical formula to arrive at the right answer for each of the vectors. That formula will depend on some of the properties of your input vectors (how many are there? are they all the same length? how long are they? etc...)

Following may help: (https://ideone.com/1Xmc9b)
template <typename T>
bool increase(const std::vector<std::vector<T>>& v, std::vector<std::size_t>& it)
{
for (std::size_t i = 0, size = it.size(); i != size; ++i) {
const std::size_t index = size - 1 - i;
++it[index];
if (it[index] == v[index].size()) {
it[index] = 0;
} else {
return true;
}
}
return false;
}
template <typename T>
void do_job(const std::vector<std::vector<T>>& v, std::vector<std::size_t>& it)
{
// Print example.
for (std::size_t i = 0, size = v.size(); i != size; ++i) {
std::cout << v[i][it[i]] << " ";
}
std::cout << std::endl;
}
template <typename T>
void iterate(const std::vector<std::vector<T>>& v)
{
std::vector<std::size_t> it(v.size(), 0);
do {
do_job(v, it);
} while (increase(v, it));
}

This is an explicit implementation of what Lother and Jerry Coffin are describing, using the useful ldiv function in a for loop to iterate through vectors of varying length.
#include <cstdlib> // ldiv
#include <iostream>
#include <map>
#include <string>
#include <vector>
using namespace std;
vector<int> vect1 {100,200};
vector<int> vect2 {10,20,30};
vector<int> vect3 {1,2,3,4};
typedef map<string,vector<int> > inputtype;
inputtype input;
vector< map<string,int> > output;
int main()
{
// init vectors
input["vect1"] = vect1;
input["vect2"] = vect2;
input["vect3"] = vect3;
long N = 1; // Total number of combinations
for( inputtype::iterator it = input.begin() ; it != input.end() ; ++it )
N *= it->second.size();
// Loop once for every combination to fill the output map.
for( long i=0 ; i<N ; ++i )
{
ldiv_t d = { i, 0 };
output.emplace_back();
for( inputtype::iterator it = input.begin() ; it != input.end() ; ++it )
{
d = ldiv( d.quot, input[it->first].size() );
output.back()[it->first] = input[it->first][d.rem];
}
}
// Sample output
cout << output[0]["vect1"] << endl; // 100
cout << output[0]["vect2"] << endl; // 10
cout << output[0]["vect3"] << endl; // 1
cout << output[N-1]["vect1"] << endl; // 200
cout << output[N-1]["vect2"] << endl; // 30
cout << output[N-1]["vect3"] << endl; // 4
return 0;
}

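Use a vector of vectors instead of separate variables, then use the following recursive algorithm (k is the number of vectors and must be at least 1; choices must have k entries):
void permutations(int i, int k, const std::vector<std::vector<int>>& vectors, std::vector<int>& choices) {
    if (i < k) {
        // Try every element of vector i, then recurse into the next vector.
        for (int x = 0; x < (int)vectors[i].size(); x++) {
            choices[i] = x;
            permutations(i + 1, k, vectors, choices);
        }
    } else {
        // One index has been chosen per vector: print the combination.
        printf("\n %d", vectors[0][choices[0]]);
        for (int j = 1; j < k; j++) {
            printf(",%d", vectors[j][choices[j]]);
        }
    }
}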

Finding Frequency of numbers in a given group of numbers

Suppose we have a vector/array in C++ and we wish to find which of its N elements occurs most often and output the highest count. Which algorithm is best suited for this job?
Example:
int a[] = { 2, 456, 34, 3456, 2, 435, 2, 456, 2 };
The output is 4 because 2 occurs 4 times, and no other number occurs more often.

Sort the array and then do a quick pass to count each number. The algorithm has O(N*logN) complexity.
Alternatively, create a hash table, using the number as the key. Store in the hash table a counter for each element you've keyed. You'll be able to count all elements in one pass; however, the complexity of the algorithm now depends on the complexity of your hashing function.
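The sort-then-scan variant could look roughly like this. The function name maxFrequencySorted is hypothetical, and the vector is taken by value so the caller's data is left unsorted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the highest number of occurrences of any value (0 for empty input).
std::size_t maxFrequencySorted(std::vector<int> values) {
    std::sort(values.begin(), values.end());  // equal values become adjacent
    std::size_t best = 0;
    std::size_t run = 0;                      // length of the current run of equal values
    for (std::size_t i = 0; i < values.size(); ++i) {
        run = (i > 0 && values[i] == values[i - 1]) ? run + 1 : 1;
        best = std::max(best, run);
    }
    assert(best <= values.size());            // sanity check on the result
    return best;
}
```

For the example data 2, 456, 34, 3456, 2, 435, 2, 456, 2 this returns 4.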

Optimized for space:
Quicksort (for example) then iterate over the items, keeping track of largest count only.
At best O(N log N).
Optimized for speed:
Iterate over all elements, keeping track of the separate counts.
This algorithm will always be O(n).

If you have the RAM and your values are not too large, use counting sort.
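Only the counting step of counting sort is needed to answer the question. Here is a sketch, assuming the values are non-negative and bounded by a known maxValue; both the function name and that parameter are assumptions for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the highest number of occurrences of any value.
// Assumes every value lies in [0, maxValue]; uses O(maxValue) extra memory.
std::size_t maxFrequencyCounting(const std::vector<int>& values, int maxValue) {
    assert(maxValue >= 0);
    std::vector<std::size_t> count(static_cast<std::size_t>(maxValue) + 1, 0);
    std::size_t best = 0;
    for (int v : values) {
        assert(v >= 0 && v <= maxValue);
        const std::size_t c = ++count[v];  // one counter per possible value
        if (c > best) {
            best = c;
        }
    }
    return best;
}
```

The count array has maxValue + 1 entries no matter how few inputs there are, which is why this only pays off when the value range is small relative to the available memory.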

A possible C++ implementation that makes use of STL could be:
#include <iostream>
#include <algorithm>
#include <map>
// functor
struct maxoccur
{
int _M_val;
int _M_rep;
maxoccur()
: _M_val(0),
_M_rep(0)
{}
void operator()(const std::pair<int,int> &e)
{
std::cout << "pair: " << e.first << " " << e.second << std::endl;
if ( _M_rep < e.second ) {
_M_val = e.first;
_M_rep = e.second;
}
}
};
int
main(int argc, char *argv[])
{
int a[] = {2,456,34,3456,2,435,2,456,2};
std::map<int,int> m;
// load the map
for(unsigned int i=0; i< sizeof(a)/sizeof(a[0]); i++)
m [a[i]]++;
// find the max occurence...
maxoccur ret = std::for_each(m.begin(), m.end(), maxoccur());
std::cout << "value:" << ret._M_val << " max repetition:" << ret._M_rep << std::endl;
return 0;
}

A bit of pseudo-code:
// split the string into an array first
array = strsplit(numbers) // PHP-style function that splits a string into its components
i = 0
while (i < count(array))
{
    if (isset(list[array[i]]))
    {
        list[array[i]]['count'] = list[array[i]]['count'] + 1
    }
    else
    {
        list[array[i]]['count'] = 1
        list[array[i]]['number'] = array[i]
    }
    i = i + 1
}
usort(list) // usort is a PHP function that sorts an array by value using a comparison callback; sort by 'count', descending. I'm assuming that you have something in C++ that does this
print list[0]['number'] // should contain the most used number

The hash algorithm (build count[i] = #occurrences(i) in basically linear time) is very practical, but is theoretically not strictly O(n) because there could be hash collisions during the process.
An interesting special case of this question is the majority algorithm, where you want to find an element which is present in more than n/2 of the array entries, if any such element exists.
This can be done in linear time, without any sort of hash trickiness (the Boyer-Moore majority vote algorithm).
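A sketch of the majority special case, using the Boyer-Moore majority vote algorithm followed by a verification pass. The function name and the out-parameter style are choices made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true and stores the value in `out` if some value occupies more than
// half of the entries; returns false otherwise. Linear time, constant extra space.
bool majorityElement(const std::vector<int>& values, int& out) {
    int candidate = 0;
    std::size_t balance = 0;
    // Pass 1: pair off differing elements; a true majority survives as candidate.
    for (int v : values) {
        if (balance == 0) {
            candidate = v;
            balance = 1;
        } else if (v == candidate) {
            ++balance;
        } else {
            --balance;
        }
    }
    // Pass 2: the candidate is only a guess, so verify it by counting.
    std::size_t occurrences = 0;
    for (int v : values) {
        if (v == candidate) {
            ++occurrences;
        }
    }
    assert(occurrences <= values.size());
    if (occurrences * 2 > values.size()) {
        out = candidate;
        return true;
    }
    return false;
}
```

Note that for the example data in the question (2 occurs 4 times out of 9) there is no majority element, so this special case does not replace the general counting approaches.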

If the range of elements is large compared with the number of elements, I would, as others have said, just sort and scan. This is time n*log n and no additional space (maybe log n additional).
The problem with the counting sort is that, if the range of values is large, it can take more time to initialize the count array than to sort.

Here's my complete, tested version, using a std::unordered_map (std::tr1::unordered_map on pre-C++11 compilers).
I make this approximately O(n) on average. Firstly it iterates through the n input values to insert/update the counts in the unordered_map, then it does a partial_sort_copy, which is O(n) for a single-element result. 2*O(n) ~= O(n).
#include <unordered_map>
#include <vector>
#include <algorithm>
#include <cassert>
#include <iostream>
#include <iterator>
#include <utility>
namespace {
// Only used in most_frequent but can't be a local class because of the member template
struct second_greater {
// Need to compare two (slightly) different types of pairs
template <typename PairA, typename PairB>
bool operator() (const PairA& a, const PairB& b) const
{ return a.second > b.second; }
};
}
template <typename Iter>
std::pair<typename std::iterator_traits<Iter>::value_type, unsigned int>
most_frequent(Iter begin, Iter end)
{
typedef typename std::iterator_traits<Iter>::value_type value_type;
typedef std::pair<value_type, unsigned int> result_type;
std::unordered_map<value_type, unsigned int> counts;
for(; begin != end; ++begin)
// This is safe because new entries in the map are defined to be initialized to 0 for
// built-in numeric types - no need to initialize them first
++ counts[*begin];
// Only need the top one at this point (could easily expand to top-n)
std::vector<result_type> top(1);
std::partial_sort_copy(counts.begin(), counts.end(),
top.begin(), top.end(), second_greater());
return top.front();
}
int main(int argc, char* argv[])
{
int a[] = { 2, 456, 34, 3456, 2, 435, 2, 456, 2 };
std::pair<int, unsigned int> m = most_frequent(a, a + (sizeof(a) / sizeof(a[0])));
std::cout << "most common = " << m.first << " (" << m.second << " instances)" << std::endl;
assert(m.first == 2);
assert(m.second == 4);
return 0;
}

It will be O(n), but for a large range of values the count array can take as much space as another array of the same size.
for(i=0;i<n;i++)
count[a[i]]++;
max=count[0];
index=0;
for(i=0;i<range;i++)
if(count[i]>max) { max=count[i]; index=i; }
Then the output is: the element index occurs the maximum number of times in this array.
Here a[] is the data array (of n elements) in which we search for the most frequent number.
count[] holds the count of each element, and range is the size of count[].
Note: we already know the range of the data in the array.
For example, if the data in the array ranges from 1 to 100, keep a count array large enough to be indexed by every possible value, and each time a value occurs, increment the entry at that index by one.

Now, in the year 2022 we have
namespace aliases
more modern containers like std::unordered_map
CTAD (Class Template Argument Deduction)
range based for loops
using declarations (type aliases)
the std::ranges library
more modern algorithms
projections
structured bindings
With that we can now write:
#include <iostream>
#include <vector>
#include <unordered_map>
#include <algorithm>
namespace rng = std::ranges;
int main() {
// Demo data
std::vector data{ 2, 456, 34, 3456, 2, 435, 2, 456, 2 };
// Count values
using Counter = std::unordered_map<decltype (data)::value_type, std::size_t> ;
Counter counter{}; for (const auto& d : data) counter[d]++;
// Get max
const auto& [value, count] = *rng::max_element(counter, {}, &Counter::value_type::second);
// Show output
std::cout << '\n' << value << " found " << count << " times\n";
}
