How to pick the max top 10 quantity from HashTable C++?


I have a working project, but I am stuck on the last part. I have already read 500,000 rows from a CSV file into a vector and put them into a hash table. I can print the whole hash table, but I need to pick the top 10 quantities from it. To be clear, I do not want to sort the whole hash table, just pick the top 10.
The assignment: the program must store individual products (identified by StockCode) from the CSV file in a suitable data structure. If a product has already been inserted, its counter must be increased by the quantity of the order. After reading and processing are finished, the program must list the "top 10" products ordered by individuals.
There is a rule about libraries: this must be a proper C++ class, and it must be possible to create many instances of it. No third-party libraries and no C++ STL, Boost, etc. are allowed. However, I/O and string classes such as iostream, ctime, fstream, and string may be used.
Important note: the only thing I should focus on is speed; storage or size is not a problem.
What I've done so far:
1. Read the CSV file row by row into a vector (StockCode is in row[1], Quantity in row[3]).
2. Put the entries into the hash table, increasing each product's quantity by the quantity of the order.
3. Print the whole hash table.
What I still need to do:
1. Print the top 10 quantities.
Below are an example CSV file, the driver code, and the output of the print function.
The CSV file looks like this:
InvoiceNo;StockCode;Description;Quantity;
536365;85123A;WHITE HANGING HEART T-LIGHT HOLDER;6
536365;71053;WHITE METAL LANTERN;6;
536365;84029G;KNITTED UNION FLAG HOT WATER BOTTLE;6;
536365;84029E;RED WOOLLY HOTTIE WHITE HEART.;6;
536365;22752;SET 7 BABUSHKA NESTING BOXES;2;
536365;21730;GLASS STAR FROSTED T-LIGHT HOLDER;6;
main.cpp
void printMaxQuantity() {
int maxValue=0;
for (int i = 0; i < 1000000; ++i) {
if(table[i] != nullptr) {
if (table[i]->quantity > maxValue)
maxValue = table[i]->quantity;
if (table[i]->quantity == maxValue) {
cout << "Index: " << i << endl;
cout << "StockCode: " << table[i]->stockCode << endl;
cout << "Quantity: " << table[i]->quantity << endl;
cout << endl << endl;
}
}
}
}
};
Here is the output (after editing the code, StockCode 85123A is the correct maximum, but I am still struggling with the top 10):
Index: 41240
StockCode: 10002
Quantity: 48
Index: 309193
StockCode: 85123A
Quantity: 72
Process finished with exit code 0
One last note: this is a school project, so I must not use any third-party software or include any other libraries (I will implement my own vector class later).

Since this is homework, I will avoid writing actual code. Since you have no prior information about the data set, you will need to loop through all of it, which has linear complexity. To find the top 10 items, I advise you to create an array of 10 items that stores the best items found so far.
The first step is to copy the first 10 elements into your array.
The second step is to sort your array of 10 items in descending order, so that the last item is always the one to compare against.
Now loop over the rest of the big structure and, at each step, compare the current item with the last one in the array of ten. If its quantity is lower, do nothing. If it is higher, find the highest-ranked item in the array whose quantity is smaller than that of the item you intend to insert. Then loop from the end of the array back to that position, overwriting each element with the one before it, so that everything from that position on shifts down by one and the old last item drops out. Finally, overwrite the now-duplicated element at that position with the new item.
Example: assuming that the element at index 7 has a lower quantity than the one you intend to insert, but the element at index 6 has a higher one, overwrite the element at index 9 with the one at index 8, then index 8 with index 7, and then index 7 with the item you just found. Remember that array indexes start from 0.
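The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the Node struct, TOP_N, findTopN and printTopN are hypothetical names, not the asker's actual class, and instead of copying and sorting the first 10 elements up front, the sketch fills the small array incrementally while keeping it in descending order. It uses only iostream and string, in line with the assignment's library rule.

```cpp
#include <iostream>
#include <string>

// Hypothetical stand-in for the asker's hash-table entry.
struct Node {
    std::string stockCode;
    int quantity;
};

const int TOP_N = 10;

// Scans table[0..tableSize) once and fills best[] with pointers to the
// entries that have the largest quantities, in descending order.
// Returns how many slots of best[] were filled (at most TOP_N).
int findTopN(Node* const table[], int tableSize, const Node* best[]) {
    int count = 0;
    for (int i = 0; i < tableSize; ++i) {
        const Node* cur = table[i];
        if (cur == nullptr)
            continue;  // empty bucket
        if (count == TOP_N && cur->quantity <= best[count - 1]->quantity)
            continue;  // not better than the weakest item kept so far
        // Start at the first free slot, or at the last slot if the array is full
        // (the weakest item is then overwritten and drops out).
        int pos = (count < TOP_N) ? count : TOP_N - 1;
        // Shift smaller items down by one slot until the right position is found.
        while (pos > 0 && best[pos - 1]->quantity < cur->quantity) {
            best[pos] = best[pos - 1];
            --pos;
        }
        best[pos] = cur;
        if (count < TOP_N)
            ++count;
    }
    return count;
}

// Prints the ranked entries, one per line.
void printTopN(const Node* const best[], int count) {
    for (int i = 0; i < count; ++i)
        std::cout << i + 1 << ". " << best[i]->stockCode << "\t" << best[i]->quantity << '\n';
}
```

The table is scanned once, and each entry costs at most TOP_N comparisons, so the work stays linear in the table size and the table itself is never modified.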

This is clearly what you want. This code picks the top 10 from your hash table:
void hashTable::printTopTen() {
    cout << "#" << " " << "Stock Code" << "\t" << "Description" << "\t\t\t" << "Quantity" << endl;
    for (int i = 0; i < 10; ++i) { // one pass over the table per rank
        int maxValue = 0;
        int indexHolder = -1; // -1 means no remaining entry with quantity > 0
        for (int index = 0; index < TABLE_SIZE; ++index) { // find the largest remaining quantity
            if (table[index] != nullptr && table[index]->quantity > maxValue) {
                maxValue = table[index]->quantity; // update maxValue with the biggest quantity
                indexHolder = index; // remember where the maximum lives
            }
        }
        if (indexHolder < 0)
            break; // fewer than 10 products with a positive quantity
        cout << i + 1 << "." << " " << table[indexHolder]->stockCode << "\t" << table[indexHolder]->description
             << "\t" << table[indexHolder]->quantity << endl;
        // Zero the printed entry so it cannot be picked again.
        // Note: this destroys the stored quantity.
        table[indexHolder]->quantity = 0;
    }
}

This question already has an answer, but I want to show you how to sort the whole table so you can compare it with your code. (The function is named selectionSort, although the shifting loop is really an insertion sort.)
Performance tip: Quick Sort can be used instead of this quadratic sort.
Here hashMap = hashTable and hashEntry = Node, so this is what I did:
void hashTable::selectionSort() {
    // Note: sorting in place reorders the buckets, so hash lookups no longer work afterwards.
    int firstCounter, secondCounter;
    Node *emptyOne = new Node("empty", "thisEmpty", 0); // shared placeholder for empty buckets
    Node *temp;
    if (table[0] == nullptr) {
        table[0] = emptyOne;
    }
    for (firstCounter = 1; firstCounter < TABLE_SIZE; firstCounter++) {
        if (table[firstCounter] == nullptr) {
            table[firstCounter] = emptyOne;
        }
        temp = table[firstCounter];
        secondCounter = firstCounter - 1;
        // Shift larger entries one slot to the right; everything left of firstCounter is already non-null.
        while (secondCounter >= 0 && table[secondCounter]->quantity > temp->quantity) {
            table[secondCounter + 1] = table[secondCounter];
            secondCounter = secondCounter - 1;
        }
        table[secondCounter + 1] = temp;
    }
}

Related

Why isn't this x value increasing in this map for loop? c++

The purpose of this loop is to look through a 2D vector and count how often each value in the first column appears. If a value shows up all three times, it is good to go. If it doesn't, I would like to delete the row it is in from the vector. The "it" iterator stores each entry as (value, frequency).
I can't figure out how to delete the row, though. I have been trying to use a counter "x" in the second for loop to keep track of which row it is on, but when I run it through the debugger, x doesn't increment. What ends up happening is that the vector deletes the first rows instead of the rows that make the if statement true.
Why isn't "x" incrementing? Is there a different method I could use to keep track of which row the loop is currently on?
"data" is the 2D vector.
for (int i = 0; i < data.size(); i++) // Process the matrix.
{
occurrences[data[i][0]]++;
}
for (map<string, unsigned int>::iterator it = occurrences.begin(); it != occurrences.end(); ++it)
{
int x = 0;
if ((*it).second < 3) // if the value doesn't show up three times, erase it
{
data.erase(data.begin() + x);
}
cout << setw(3) << (*it).first << " ---> " << (*it).second << endl; // show results
x++;
}

You reset x back to 0 every loop. Initialize it outside the loop and it should work.
int x = 0;

You have to initialize x outside the for loop. If you declare it inside the loop, it is set to 0 on every iteration. Your current program deletes the first element each time because x is always zero here: data.erase(data.begin() + x);
for (int i = 0; i < data.size(); i++) // Process the matrix.
{
occurrences[data[i][0]]++;
}
int x = 0;
for (map<string, unsigned int>::iterator it = occurrences.begin(); it != occurrences.end(); ++it)
{
if ((*it).second < 3) // if the value doesn't show up three times, erase it
{
data.erase(data.begin() + x);
}
cout << setw(3) << (*it).first << " ---> " << (*it).second << endl; // show results
x++;
}
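One caveat: even with x declared outside the loop, x counts entries of the map (which iterates in sorted key order), not rows of data, so data.begin() + x will not necessarily point at a row that holds the rare value. A minimal sketch of one alternative is to count first and then remove the matching rows in a single pass with the erase/remove_if idiom. The function name eraseRareRows and the row type vector<string> are assumptions for illustration only.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Removes every row whose first-column value appears fewer than minCount times.
// Assumes every row has at least one column.
void eraseRareRows(std::vector<std::vector<std::string>>& data, unsigned int minCount) {
    // First pass: count how often each first-column value occurs.
    std::map<std::string, unsigned int> occurrences;
    for (const auto& row : data)
        ++occurrences[row[0]];

    // Second pass: move the rows to keep to the front, then erase the tail.
    data.erase(std::remove_if(data.begin(), data.end(),
                              [&](const std::vector<std::string>& row) {
                                  return occurrences[row[0]] < minCount;
                              }),
               data.end());
}
```

Calling eraseRareRows(data, 3) keeps only the rows whose key shows up at least three times, and the relative order of the remaining rows is preserved.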

Program only works with inclusion of (side effects free) cout statements?

So I've been working on problem 15 from the Project Euler website, and my solution was working great until I decided to remove the cout statements I was using for debugging while writing the code. My solution works by generating Pascal's Triangle in a 1D array and finding the element that corresponds to the number of paths in the NxN lattice specified by the user. Here is my program:
#include <iostream>
using namespace std;
//Returns sum of first n natural numbers
int sumOfNaturals(const int n)
{
int sum = 0;
for (int i = 0; i <= n; i++)
{
sum += i;
}
return sum;
}
void latticePascal(const int x, const int y, int &size)
{
int numRows = 0;
int sum = sumOfNaturals(x + y + 1);
numRows = x + y + 1;
//Create array of size (sum of first x + y + 1 natural numbers) to hold all elements in P's T
unsigned long long *pascalsTriangle = new unsigned long long[sum];
size = sum;
//Initialize all elements to 0
for (int i = 0; i < sum; i++)
{
pascalsTriangle[i] = 0;
}
//Initialize top of P's T to 1
pascalsTriangle[0] = 1;
cout << "row 1:\n" << "pascalsTriangle[0] = " << 1 << "\n\n"; // <--------------------------------------------------------------------------------
//Iterate once for each row of P's T that is going to be generated
for (int i = 1; i <= numRows; i++)
{
int counter = 0;
//Initialize end of current row of P's T to 1
pascalsTriangle[sumOfNaturals(i + 1) - 1] = 1;
cout << "row " << i + 1 << endl; // <--------------------------------------------------------------------------------------------------------
//Iterate once for each element of current row of P's T
for (int j = sumOfNaturals(i); j < sumOfNaturals(i + 1); j++)
{
//Current element of P's T is not one of the row's ending 1s
if (j != sumOfNaturals(i) && j != (sumOfNaturals(i + 1)) - 1)
{
pascalsTriangle[j] = pascalsTriangle[sumOfNaturals(i - 1) + counter] + pascalsTriangle[sumOfNaturals(i - 1) + counter + 1];
cout << "pascalsTriangle[" << j << "] = " << pascalsTriangle[j] << '\n'; // <--------------------------------------------------------
counter++;
}
//Current element of P's T is one of the row's ending 1s
else
{
pascalsTriangle[j] = 1;
cout << "pascalsTriangle[" << j << "] = " << pascalsTriangle[j] << '\n'; // <---------------------------------------------------------
}
}
cout << endl;
}
cout << "Number of SE paths in a " << x << "x" << y << " lattice: " << pascalsTriangle[sumOfNaturals(x + y) + (((sumOfNaturals(x + y + 1) - 1) - sumOfNaturals(x + y)) / 2)] << endl;
delete[] pascalsTriangle;
return;
}
int main()
{
int size = 0, dim1 = 0, dim2 = 0;
cout << "Enter dimension 1 for lattice grid: ";
cin >> dim1;
cout << "Enter dimension 2 for lattice grid: ";
cin >> dim2;
latticePascal(dim1, dim2, size);
return 0;
}
The cout statements that seem to be saving my program are marked with commented arrows. It seems to work as long as any of these lines are included. If all of these statements are removed, then the program will print: "Number of SE paths in a " and then hang for a couple of seconds before terminating without printing the answer. I want this program to be as clean as possible and to simply output the answer without having to print the entire contents of the triangle, so it is not working as intended in its current state.

There's a good chance that either the expression that calculates the array index or the one that calculates the array size for allocation causes undefined behaviour, for example an out-of-bounds memory access.
Because the visible effect of undefined behaviour is, by definition, not defined, the program can work as you intended or it can do something else, which could explain why it works with one compiler but not another.
You could use a vector with vector::resize() and vector::at() instead of an array with new and [] to get better diagnostics in case the program aborts before writing or flushing all of its output due to an invalid memory access.
If the problem is due to an invalid index, vector::at() will throw an exception (std::out_of_range). If you don't catch it, many debuggers will stop at that point and let you inspect where in the program the problem occurred, along with key facts such as which index you were trying to access and the contents of the variables.
They'll typically show you more stack frames than you expect, because some are internal details of how the system handles uncaught exceptions. Walk up the stack until you reach the frame from your own code, and inspect the context there.
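As a rough illustration of the vector::at() suggestion, here is a sketch of the same flattened Pascal's Triangle built with bounds-checked access. The names rowStart and latticePaths are made up for this example. In the question's code, the outer loop appears to run one row too far (i <= numRows while the array only holds numRows rows), which would write past the end of the allocation; with at(), a mistake like that throws std::out_of_range instead of silently corrupting memory.

```cpp
#include <cstddef>
#include <vector>

// Index of the first element of row r (0-based) in the flattened triangle:
// rows 0 .. r-1 hold 1 + 2 + ... + r elements in total.
std::size_t rowStart(std::size_t r) {
    return r * (r + 1) / 2;
}

// Number of monotone paths through an x-by-y lattice, i.e. C(x + y, x).
unsigned long long latticePaths(std::size_t x, std::size_t y) {
    const std::size_t numRows = x + y + 1;  // rows 0 .. x + y
    std::vector<unsigned long long> tri(rowStart(numRows), 0);

    for (std::size_t r = 0; r < numRows; ++r) {
        tri.at(rowStart(r)) = 1;      // left edge of the row
        tri.at(rowStart(r) + r) = 1;  // right edge of the row
        // Interior elements are the sum of the two elements above them.
        for (std::size_t k = 1; k < r; ++k) {
            tri.at(rowStart(r) + k) =
                tri.at(rowStart(r - 1) + k - 1) + tri.at(rowStart(r - 1) + k);
        }
    }
    // Element x of the last row is C(x + y, x).
    return tri.at(rowStart(x + y) + x);
}
```

For example, latticePaths(3, 4) yields 35, matching the g++ run quoted in the other answer.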

Your program works well with g++ on Linux:
$ g++ -o main pascal.cpp
$ ./main
Enter dimension 1 for lattice grid: 3
Enter dimension 2 for lattice grid: 4
Number of SE paths in a 3x4 lattice: 35
There's got to be something else since your cout statements have no side effects.
Here's an idea on how to debug this: open two Visual Studio instances, one with the version without the cout statements and the other with the version that has them. Step through both to find the first difference between them. My guess is that you will find that the cout statements have nothing to do with the error.

histogram program gives strange output C++

I have been writing code to produce a horizontal histogram. This program takes user input of any range of numbers into a vector. Then it asks the user for the lowest value they want the histogram to begin at, and how big they want each bin to be. For example:
If lowestValue = 1 and binSize = 20, and the vector is filled with the values {1, 2, 3, 20, 30, 40, 50}, it would print something like:
(bin)   (bars) (num) (percent)
[ 1-21) ####   4     57%
[21-41) ##     2     28%
[41-61) #      1     14%
Here is most of the code that does so:
void printHistogram(int lowestValue, int binSize, vector<double> v)
{
int binFloor = lowestValue, binCeiling = 0;
int numBins = amountOfBins(binSize, (int)range(v));
for (int i = 0; i<=numBins; i++)
{
binCeiling = binFloor+binSize;
int amoInBin = amountInBin(v,binFloor, binSize);
double perInBin = percentInBin(v, amoInBin);
if (binFloor < 10)
{
cout << "[ " << binFloor << '-' << binCeiling << ") " << setw(20) << left << formatBars(perInBin) << ' ' << amoInBin << ' '<< setprecision(4) << perInBin << '%' << endl;
binFloor += binSize;
}
else
{
cout << '[' << binFloor << '-' << binCeiling << ") " << setw(20) << left << formatBars(perInBin) << ' ' << amoInBin << ' '<< setprecision(4) << perInBin << '%' << endl;
binFloor += binSize;
}
}
}
and the function that counts how many terms are in each bin:
int amountInBin(vector<double> v, int lowestBinValue, int binSize)
{
int count = 0;
for (size_t i; i<v.size(); i++)
{
if (v[i] >= lowestBinValue && v[i] < (lowestBinValue+binSize))
count += 1;
}
return count;
}
Now my issue:
For some reason, it is not counting values between 20-40. At least as far as I can see from my testing. Here is an image of a run:
Any help is appreciated.

I would suggest a different approach. Making two passes, first calculating the number of bins and then adding them up, looks fragile and error-prone, so it is not really a surprise to see you chasing a bug of this kind. I think your original approach is too complicated.
As the saying goes, "the more you overthink the plumbing, the easier it is to stop up the drain". Find the simplest way to do something, and it will have the fewest surprises and gotchas to deal with.
I think it's simpler to make a single pass over the values, calculating which bin each value belongs to, and counting the number of values seen per bin. Let's use a std::map, keyed by bin number, with the value being the number of values in each bin.
void printHistogram(int lowestValue, int binSize, const std::vector<double> &v)
{
std::map<int, size_t> histogram;
for (auto value:v)
{
int bin_number= value < lowestValue ? 0:(value-lowestValue)/binSize;
++histogram[bin_number];
}
And ...that's it. histogram is now your histogram. histogram[0] is now the number of values in the first bin, [lowestValue, lowestValue+binSize), which also includes all values less than lowestValue. histogram[1] will be the number of values found for the next bin, and so on.
Now, you just have to iterate over the histogram map, and generate your actual histogram.
Now, the tricky part here is that the histogram map will only include keys for which at least 1 value was found. If no value was dropped into the bin, the map will not include the bin number. So, if there were no values in the first bin, histogram[0] won't even exist, the first value in the map will be the bin for the lowest value in the vector.
This isn't such a difficult problem to solve, by iterating over the map with a little bit of extra intelligence:
int next_bin_number=0;
for (auto b=histogram.begin(); b != histogram.end(); b++)
{
while (next_bin_number < b->first)
{
// next_bin_number had 0 values. Print the histogram row
// for bin #next_bin_number, showing 0 values in it.
++next_bin_number;
}
int n_values=b->second;
// Bin #next_bin_number has n_values values: print its histogram row
++next_bin_number;
}
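Putting the two fragments together, a minimal self-contained sketch might look like the following. The function name binCounts and the choice to return the counts as a vector (rather than print rows directly) are assumptions made here so the result is easy to inspect; empty bins between occupied ones are filled in with zeros, as described above.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Returns the number of values per bin, starting with bin 0 =
// [lowestValue, lowestValue + binSize). Values below lowestValue are
// counted in bin 0. Bins with no values are reported as 0.
std::vector<std::size_t> binCounts(int lowestValue, int binSize, const std::vector<double>& v) {
    // Single pass: map each value to its bin number and count it.
    std::map<int, std::size_t> histogram;
    for (double value : v) {
        int bin = value < lowestValue ? 0 : static_cast<int>((value - lowestValue) / binSize);
        ++histogram[bin];
    }

    // Walk the map in key order, emitting zeros for the missing bins.
    std::vector<std::size_t> counts;
    int nextBin = 0;
    for (const auto& entry : histogram) {
        while (nextBin < entry.first) {
            counts.push_back(0);  // this bin had no values
            ++nextBin;
        }
        counts.push_back(entry.second);
        ++nextBin;
    }
    return counts;
}
```

Each element of the result can then be turned into one printed histogram row, with the bin's lower bound computed as lowestValue + index * binSize.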

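The for loop in amountInBin declares i without initializing it (for (size_t i; ...)), so reading it is undefined behaviour and the results are at best unpredictable. Initialize it with size_t i = 0.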

Problems with string comparing with pointer

I have three lists and I want to implement a search feature.
The code creates an iterator that begins at the start of each list and compares what the user inputs with every value in the list. When it finds a match, it is supposed to increase an integer variable by one, so in the end it would say:
your value is found: <x amount of times in Example list>
The problem is that it compiles fine, but the end result is still 0, as if the variable were never incremented.
I am wondering if it is having trouble comparing the value the iterator points to with the user input. Can anyone please shed some light on this? For testing purposes I manually put 4 identical values in the list that search_disregard iterates over, so I know the end result should show 4, but I still get 0:
cout << "\nSearch for: ";
string edit_search;
cin >> edit_search;
list<string>::iterator search_disregard = disregard_list.begin();
list<string>::iterator search_compare = compare_list.begin();
int search_disregard_count = 0;
int search_compare_count = 0;
for (int x = 0; x < disregard_list.size(); ++x)
{
if (*search_disregard == edit_search)
{
++search_disregard_count;
}
}
for (int x = 0; x < compare_list.size(); ++x)
{
if (*search_compare == edit_search)
{
++search_compare_count;
}
}
cout << edit_tag << edit_search << " is found in the following: \n" << endl;
cout << search_disregard_count << " time(s) in the Disregard List" << endl;
cout << search_compare_count << " time(s) in the Compare List" << endl;
buffer_clear();

You never increment your iterators so they will always point to the first element. The idiomatic way:
for(auto it = container.begin(); it != container.end(); ++it) ...
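Applied to the counting problem from the question, the idiomatic loop could be sketched like this. The helper name countMatches is hypothetical; the point is only that the iterator itself is advanced in the loop header, so each element is compared exactly once.

```cpp
#include <list>
#include <string>

// Counts how many elements of items are equal to target.
int countMatches(const std::list<std::string>& items, const std::string& target) {
    int count = 0;
    for (auto it = items.begin(); it != items.end(); ++it) {
        if (*it == target)
            ++count;
    }
    return count;
}
```

With this helper, the two loops in the question reduce to countMatches(disregard_list, edit_search) and countMatches(compare_list, edit_search).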

Index/Max/Min of a Vector<double> C++

I want to find the highest and lowest elements in a vector and also the position/index of each.
For example,
vector<double> x;
std::cout << "Enter in #s: ";
double numbers;
std::getline(std::cin, numbers);
x.push_back(numbers);
Let's say the user inputted 4.3 1.0 2.99 43.5
I would want the result to say
The highest number is 43.5 at position 4
The lowest number is 1.0 at position 2
I was wondering if there is any way to implement this code WITHOUT using the min_element/max_element function and do it with a for loop?
I wanted to use something like:
for (int i=0;i < x.size();i++)
if ( //the number is less than ) {
std::cout << "The lowest number is...... at position .....";
if ( //the number is greather than ) {
std::cout << "The highest number is......at position......";

Compare each number to the best max/min found so far.
If it is bigger/smaller, replace the max/min with it and note the index.
You will need a max and a min variable and two indexes. Be careful what you set the initial values of your max and min to.

For that, you need to store both the indices of the highest and lowest elements, comparing them with the current element for each iteration.
// Note: the below code assumes that the container (vector) is not empty
// you SHOULD check if the vector contains some elements before executing the code below
int hi, lo; // These are indices pointing to the highest and lowest elements
hi = lo = 0; // Set hi and lo to the first element's index
// Then compare the elements indexed by hi and lo with the rest of the elements
for (int i = 1;i < x.size();i++) {
if(x[i] < x[lo]) {
// The element indexed by i is less than the element indexed by lo
// so set the index of the current lowest element to i
lo = i;
}
// Below, else if is used and not if because the conditions cannot be both true
else if(x[i] > x[hi]) {
// Same logic as the above, only for the highest element
hi = i;
}
}
// Note: the position indicated by the output below will be 0-based
std::cout << "The lowest number is " << x[lo] << " at position " << lo << ".\n";
std::cout << "The highest number is " << x[hi] << " at position " << hi << ".\n";

size_t iMax=0,iMin=0;
for(size_t i=1; i<x.size(); ++i)
{
if(x[iMax] < x[i])
iMax=i;
if(x[iMin] > x[i])
iMin=i;
}
//iMax is index of the biggest num in the array
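For completeness, the same loop can be wrapped in a small function that returns both indices. The struct Extremes and the function findExtremes are names invented for this sketch, and it assumes the vector is not empty (the caller should check that first).

```cpp
#include <cstddef>
#include <vector>

// Holds the 0-based positions of the smallest and largest elements.
struct Extremes {
    std::size_t minIndex;
    std::size_t maxIndex;
};

// Precondition: x is not empty.
// On ties, the index of the first occurrence is kept.
Extremes findExtremes(const std::vector<double>& x) {
    Extremes result = {0, 0};  // start with the first element as both min and max
    for (std::size_t i = 1; i < x.size(); ++i) {
        if (x[i] > x[result.maxIndex])
            result.maxIndex = i;
        if (x[i] < x[result.minIndex])
            result.minIndex = i;
    }
    return result;
}
```

For the input from the question (4.3, 1.0, 2.99, 43.5), this gives minIndex 1 and maxIndex 3; add 1 to each when printing if 1-based positions are wanted.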
