vector not producing expected output c++

I am creating an idea bank to hold ideas entered from the keyboard or read from a txt file.
Each idea follows this pattern:
ID:
Proposer:
keywords:
content:
I am then implementing an indexing algorithm for the idea bank using an inverted index, based on the following struct:
struct Index {
string key;
vector<int> idList;
};
where key represents a keyword in an idea and idList holds the IDs of the ideas that contain it.
I am then storing the index in an AVL tree.
Here is my code to create the inverted index.
void IdeaBank::AVLTreeIndexing(){
vector<string> kwords_vec;
vector<int> relevantIDs;
int foundIdIn;
string kword;
Index input;
for (int loop = 0; loop < newIdea.size(); loop++){
kwords_vec = newIdea[loop].getKeyword();
for (int i = 0; i < kwords_vec.size(); i++){
// goes through all ideas
for (int j = 0; j < newIdea.size(); j++){
if (newIdea[j].foundWordInBoth(kwords_vec[i])){
input.key = kwords_vec[i];
input.idList.push_back(newIdea[j].getID());
tree.AVL_Insert(input);
relevantIDs.push_back(input.idList[j]);
}// end of lookfor
}
}// end of kwords.size loop
}
}
my logic behind the above function is the following:
1) go through all the ideas and get the keyword
2) check if an idea contains the keyword
3) store the word as a key in my struct and store the ID in my vector in the struct
4) insert the struct into my avl tree
I am then trying to create a search function that prints all the ideas that contain the word in their keywords, and this is where I believe I am having problems.
Here is the code:
void IdeaBank::searchQueryFromBank(string word){
Index index;
vector <int> test;
if (tree.AVL_Retrieve(word, index)){
cout << "found in tree"<<endl;
cout << "Relevant idea ID's for "
<< word << ":" << endl;
for (int i = 0; i < index.idList.size(); i++){
cout << index.idList[i] << endl;
test.push_back(index.idList[i]);
}
}
else
{
cout << "No relevant ideas found for " << word << endl;
}
cout << endl;
cout << "displaying the following Ideas"<<endl;
for (int i=0;i<test.size();i++)
{
displayIdeaByID(test[i]);
}
}
The problem I am having:
My ID list vector is being populated with IDs of ideas that don't contain the keyword.
For example, say I have the following two ideas:
ID: 1
Proposer: bob
keywords: computer,laptop
content: computer with built in microphone
ID: 2
Proposer: bob
keywords: smartphone
content: smartphone with built in microphone
If I were to search for the keyword "smartphone",
my result would print the following IDs:
ID 0...
ID 0...
ID 1...
In my indexing function, the function foundWordInBoth is defined as:
bool Idea::foundWordInBoth(string word){
if (find(keyword.begin(), keyword.end(), word) != keyword.end()){
return true;
}
size_t pos;
pos = content.find(word);
if (pos != string::npos)
{
return true;
}
return false;
}
The above function checks whether a word is found in either the keywords or the content of the idea.
Overall, I am unsure why it is printing out ideas that do not contain a certain keyword.

I would guess that the problem is here
void IdeaBank::AVLTreeIndexing(){
vector<string> kwords_vec;
vector<int> relevantIDs;
int foundIdIn;
string kword;
Index input;
for (int loop = 0; loop < newIdea.size(); loop++){
kwords_vec = newIdea[loop].getKeyword();
for (int i = 0; i < kwords_vec.size(); i++){
// goes through all ideas
for (int j = 0; j < newIdea.size(); j++){
if (newIdea[j].foundWordInBoth(kwords_vec[i])){
input.key = kwords_vec[i];
input.idList.push_back(newIdea[j].getID());
tree.AVL_Insert(input);
relevantIDs.push_back(input.idList[j]);
}// end of lookfor
}
}// end of kwords.size loop
}
}
You only declare one Index object, which you then push back IDs to for the entire execution of the function. So the list of IDs just builds and builds.
I'm finding the logic a little hard to follow, because you only ever seem to have one ID for each key, but clearly you need to move the declaration of input to a narrower scope.
Maybe it should look something like this (but really I'm guessing)
void IdeaBank::AVLTreeIndexing() {
for (int loop = 0; loop < newIdea.size(); loop++) {
vector<string> kwords_vec = newIdea[loop].getKeyword();
for (int i = 0; i < kwords_vec.size(); i++) {
Index input;
input.key = kwords_vec[i];
// goes through all ideas
for (int j = 0; j < newIdea.size(); j++) {
if (newIdea[j].foundWordInBoth(kwords_vec[i])) {
input.idList.push_back(newIdea[j].getID());
}// end of lookfor
}
tree.AVL_Insert(input);
}// end of kwords.size loop
}
}
In general get used to declaring variables where you need them, instead of declaring them all at the beginning of a function.
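To make the "fresh list per keyword" idea concrete, here is a minimal, self-contained sketch. It is not the asker's code: IdeaSketch, containsWord and buildIndex are hypothetical names, and std::map stands in for the AVL tree, whose interface is not shown in the question. The sketch also skips keywords that were already indexed, because how AVL_Insert treats a repeated key is unknown.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's Idea class.
struct IdeaSketch {
    int id;
    std::vector<std::string> keywords;
    std::string content;

    // Mirrors foundWordInBoth: true if the word is a keyword or occurs in the content.
    bool containsWord(const std::string& word) const {
        for (const std::string& k : keywords) {
            if (k == word) return true;
        }
        return content.find(word) != std::string::npos;
    }
};

// Builds keyword -> list of idea IDs. std::map is an assumption that replaces the AVL tree.
std::map<std::string, std::vector<int>> buildIndex(const std::vector<IdeaSketch>& ideas) {
    std::map<std::string, std::vector<int>> index;
    for (const IdeaSketch& owner : ideas) {
        for (const std::string& word : owner.keywords) {
            if (index.count(word) > 0) continue;  // keyword was already indexed
            std::vector<int> ids;                 // fresh list for every keyword
            for (const IdeaSketch& idea : ideas) {
                if (idea.containsWord(word)) ids.push_back(idea.id);
            }
            assert(!ids.empty());                 // the owner itself always matches
            index[word] = ids;
        }
    }
    return index;
}
```

With the two example ideas from the question, looking up "smartphone" in the returned map should give a list that holds only ID 2.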

Related

2d array comparing with char

I have an array that reads data from a file, the data is binary digits such as 010011001001 and many others so the data are strings which I read in to my 2d array but I am stuck on comparing each value of the array to 0. Any help would be appreciated.
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
string myArr[5000][12];
int i = 0, zeroCount = 0, oneCount = 0;
ifstream inFile;
inFile.open("Day3.txt");
while(!inFile.eof())
{
for(int i = 0; i < 5000; i++)
{
for(int j = 0; j < 12; j++)
{
inFile >> myArr[i][j];
j++;
}
i++;
}
}
for(int j = 0; j < 12; j++)
{
for(int i = 0; i < 5000; i++)
{
if(myArr[i][j].compare("0") == 0)
{
zeroCount++;
}
else
{
oneCount++;
}
i++;
}
if(zeroCount > oneCount)
{
cout << "Gamma is zero for column " << i << endl;
}
else
{
cout << "Gamma is One for column " << i << endl;
}
j++;
}
}
some input from the text file:
010110011101
101100111000
100100000011
111000010001
001100010011
010000111100
Thank you for editing your question and providing more information. Now we can help you. You have 2 major misunderstandings:
How does a for loop work?
What is a std::string in C++?
Let us start with the for loop. You can find an explanation in the CPP reference here, or look at the tutorial shown here.
The for loop basically has 3 parts: for (part1; part2; part3). All of them are optional; you can use them, but you do not have to.
part1 is the init-statement. Here you can declare, define or initialize a variable. In your case it is int i = 0: you define a variable of type int and initialize it with the value 0.
part2 is the condition. The loop runs until the condition becomes false. The condition is checked at the beginning of each loop run.
part3 is the so-called iteration-expression. The term is a little misleading. It is basically a statement that is executed at the end of each loop run, before the condition is checked again and the next run starts.
In Pseudo code it is something like this:
{
init-statement
while ( condition ) {
statement
iteration-expression ;
}
}
For your loop for(int j = 0; j < 12; j++), this means:
{
int j = 0; // init-statement
while ( j < 12 ) { // while ( condition ) {
inFile >> myArr[i][j]; // Your loop statements
j++; // Your loop statements PROBLEM
j++; // iteration-expression from the for loop
}
}
And now you see the problem. You unfortunately increment 'j' twice. You do not need to do that. The last part3 of the for loop does this for you already.
So please delete the duplicated increment statements.
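A small sketch can make the effect visible. The two helper functions below are hypothetical and only record which values of j the loop body actually sees; they are not part of the original program.

```cpp
#include <cassert>
#include <vector>

// Records the values of j seen by a loop body that increments j itself,
// in addition to the increment in the for header.
std::vector<int> visitedWithDoubleIncrement(int limit) {
    assert(limit >= 0);
    std::vector<int> visited;
    for (int j = 0; j < limit; j++) {
        visited.push_back(j);
        j++;  // extra increment: the for header will increment j again
    }
    return visited;
}

// Same loop, but only the for header increments j.
std::vector<int> visitedWithSingleIncrement(int limit) {
    assert(limit >= 0);
    std::vector<int> visited;
    for (int j = 0; j < limit; j++) {
        visited.push_back(j);
    }
    return visited;
}
```

With a limit of 12, the first function only sees the even indices 0, 2, 4, 6, 8 and 10, so every second element of the array would be skipped. The second function sees all 12 indices.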
Next, the std::string.
A string is, as its name says, a string of characters or, in the context of programming languages, an array of characters.
In C we used to write something like char s[42] = "abc";, so really an array of characters. The problem was always the fixed length of such a string, here for example 42. In such an array you can store at most 41 characters plus the terminating '\0'. A longer string would not fit.
The inventors of C++ solved this problem. They created a dynamic character array, an array that can grow if needed, and called it std::string. It does not have a predefined length. It grows as needed. That is a big advantage over C-style strings.
Therefore, writing string myArr[5000][12]; shows that you did not fully understand this concept. You do not need the [12], because each string can already hold the 12 characters, so you can delete it. The characters will implicitly be there. And if you write inFile >> myString, the extraction operator >> reads characters from the stream until the next whitespace and stores them in your myString variable, regardless of how long the string is.
Please read this tutorial about strings.
Then your code could look like:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
string myArr[5000];
int zeroCount = 0, oneCount = 0;
ifstream inFile;
inFile.open("Day3.txt");
while (!inFile.eof())
{
for (int i = 0; i < 5000; i++)
{
inFile >> myArr[i];
}
}
for (int j = 0; j < 12; j++) // one pass per column
{
zeroCount = 0; oneCount = 0;
for (int i = 0; i < 5000; i++) // walk down the rows
{
if (myArr[i].empty()) break; // no more rows were read
if (myArr[i][j] == '0')
{
zeroCount++;
}
else
{
oneCount++;
}
}
if (zeroCount > oneCount)
{
cout << "Gamma is zero for column " << j << endl;
}
else
{
cout << "Gamma is One for column " << j << endl;
}
}
}
But there is more. You use the magic number 5000 for your array of strings because you assume that 5000 is always big enough to hold all strings. But what if it is not? If your source file contains more than 5000 strings, your code will no longer work correctly.
Just as std::string replaces the fixed-size character array, C++ also has an array for any kind of data that can grow dynamically as needed. It is called std::vector and you can read about it here. A tutorial can be found here.
With that you can get rid of C-style arrays altogether. Please continue to study C++ and you will understand more and more.
There are more subtle problems in your code, like while(!inFile.eof()), but these can be solved later.
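As a rough sketch of where this could lead, the following version uses std::vector and lets the extraction itself control the reading loop instead of testing eof(). The function names readRows and mostCommonBits are made up for this example. It assumes that every row has the same length and keeps the tie rule of the code above (a column counts as '1' unless zeros are strictly in the majority).

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Reads whitespace-separated rows until extraction fails (end of input or error).
std::vector<std::string> readRows(std::istream& in) {
    std::vector<std::string> rows;
    std::string row;
    while (in >> row) {  // the loop ends as soon as a read fails
        rows.push_back(row);
    }
    return rows;
}

// For each column, yields '0' if zeros strictly outnumber ones, otherwise '1'.
std::string mostCommonBits(const std::vector<std::string>& rows) {
    std::string result;
    if (rows.empty()) return result;
    const std::size_t width = rows[0].size();
    for (std::size_t col = 0; col < width; col++) {
        int zeroCount = 0, oneCount = 0;
        for (const std::string& row : rows) {
            assert(row.size() == width);  // assumption: all rows have equal length
            if (row[col] == '0') zeroCount++; else oneCount++;
        }
        result += (zeroCount > oneCount) ? '0' : '1';
    }
    return result;
}
```

The same readRows function should work with a std::ifstream opened on Day3.txt, because both stream types derive from std::istream.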
I hope I could help

Is there a way to count the number of ocurrences of each element of a string array?

I have the following code that does exactly what I want. The problem is that I need the sample array to compare the strings and keep the count. Is there a way to count the number of occurrences of each string on any array without a sample?
For a little bit more context, the initial problem was to read data from a .txt file including vehicles information, like:
Volkswagen Jetta
Ford Focus
Volkswagen Jetta
And count the number of vehicles of each brand. Keep in mind that this is from an introductory course for programming, and we don't know how to use vectors or maps.
#include <iostream>
#include <string>
using namespace std;
using std::string;
#define MAX 20
int main(){
int counter[MAX];
string arr[MAX]={"ABC","AOE","ADC","ABC","ADC","ADC"};
string sample[MAX]={"ABC", "AOE", "ADC"};
for(int i=0; i<MAX; i++){
counter[i]=0;
}
for(int i=0; i<MAX;i++){
for(int j=0; j<MAX; j++){
if (sample[i]==arr[j]){
counter[i]++;
}
}
}
for(int i=0; i<3;i++){
cout<< sample[i] << "=" << counter[i]<<endl;
}
return 0;
}
All you are expected to do is keep a list (an array will do) of brand names, and an array of counts for each name:
std::string brand_names[100];
int counts[100]; // number of times each element of brand_names[] was read from file
int num_items = 0;
Each time you read a brand name from file, try to find it in the array of strings. If found, just add one to the count at the same index. If not found, add it to the end of the brand_names[] array, add 1 to the end of the counts[] array, and increment num_items.
You do not need anything more than two simple nested loops for this:
an outer loop to read the next brand name from the file
an inner loop to try to find the brand name in the list
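The two loops described above could be sketched as follows. BrandCounts and countBrands are hypothetical names, the three arrays are wrapped in a struct only to keep the example self-contained, and the sketch assumes that the brand is the first word on each line. It uses std::istringstream to split a line, which may go beyond what an introductory course has covered.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

const int MAX_BRANDS = 100;  // same capacity as the arrays above

// Parallel arrays as described above, grouped for convenience.
struct BrandCounts {
    std::string brand_names[MAX_BRANDS];
    int counts[MAX_BRANDS] = {0};
    int num_items = 0;
};

// Reads "Brand Model" lines and counts how often each brand occurs.
void countBrands(std::istream& in, BrandCounts& result) {
    std::string line;
    while (std::getline(in, line)) {                  // outer loop: next line of input
        std::istringstream words(line);
        std::string brand;
        if (!(words >> brand)) continue;              // skip blank lines
        int found = -1;
        for (int i = 0; i < result.num_items; i++) {  // inner loop: search the list
            if (result.brand_names[i] == brand) {
                found = i;
                break;
            }
        }
        if (found >= 0) {
            result.counts[found]++;                   // known brand: add one
        } else {
            assert(result.num_items < MAX_BRANDS);    // sketch assumes the arrays are large enough
            result.brand_names[result.num_items] = brand;
            result.counts[result.num_items] = 1;
            result.num_items++;
        }
    }
}
```

For the three sample lines in the question this should leave two entries, Volkswagen with a count of 2 and Ford with a count of 1.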
If you want to solve this problem without knowing the initial values of the sample array:
Create an empty sample array. When you see new elements add them to this array.
Use a variable sample_size to keep track how many samples have been seen. Below is a simple example which doesn't use std::vector or dynamic allocation.
int main()
{
std::string arr[MAX] = { "ABC","AOE","ADC","ABC","ADC","ADC" };
std::string sample[MAX];
int sample_size = 0;
int counter[MAX] = { 0 };
for (int i = 0; i < MAX; i++)
{
if (arr[i].empty()) break;
bool sample_found = false;
for (int j = 0; j < sample_size; j++)
if (arr[i] == sample[j])
{
sample_found = true;
counter[j]++;
break;
}
if (!sample_found)
{
sample[sample_size] = arr[i];
counter[sample_size]++;
sample_size++;
}
}
for (int i = 0; i < sample_size; i++)
cout << sample[i] << "=" << counter[i] << std::endl;
return 0;
}

Find indexes of duplicated (repeated) numbers in an array

I have 2 arrays: arr1 stores a number (the salary) and arr2 stores a string (the employee's name). Since the two arrays are linked, I cannot change the order of arr1 or sort it. I am looking for a more efficient way to find out whether the array contains any duplicates. There might be more than one duplicate, but if none are found it should print "no duplicates found".
int count = 0;
for (int i = 0;i<arr_size ;i++)
{
for (int j = 0; j < arr_size && i != j; j++)
{
if (arr[i] == arr[j])
{
cout << arr2[i] << " " << arr1[i] << endl;
cout << arr2[j] << " " << arr1[j] << endl;
count ++;
}
}
}
if (count == 0)
{
cout << "No employee have same salaries"<<endl;
}
I don't want to use such an inefficient way to solve the problem. Is there any better suggestion? Thanks for the help :)
And the question also requires me to print out all the duplicated employee and salaries pair
You can use an unordered_set which has an average constant time insertion and retrieval:
#include <unordered_set>
// ...set up arr
int count = 0;
std::unordered_set<int> salaries;
for (int i = 0; i < arr_size; i ++) {
if (salaries.count(arr[i]) > 0) {
// it's a duplicate
}
salaries.insert(arr[i]);
}
// do more stuff
Create a hash map using unordered_map and store each salary together with the index where it occurs.
If the same salary already exists in the map, increase count.
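One possible way to flesh out the hash map idea is sketched below. It is a variation rather than exactly what is described above: the map stores how often each salary occurs instead of an index, and a second pass reports every employee whose salary occurs more than once. findSharedSalaries is a hypothetical name, and std::vector replaces the plain arrays only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Returns one "name salary" entry per employee whose salary is shared with
// at least one other employee, in the original order. An empty result means
// that no duplicates were found.
std::vector<std::string> findSharedSalaries(const std::vector<int>& salaries,
                                            const std::vector<std::string>& names) {
    assert(salaries.size() == names.size());  // the two arrays are parallel

    // First pass: count how often each salary occurs (average O(1) per update).
    std::unordered_map<int, int> occurrences;
    for (int salary : salaries) {
        occurrences[salary]++;
    }

    // Second pass: report everyone whose salary occurs more than once.
    std::vector<std::string> report;
    for (std::size_t i = 0; i < salaries.size(); i++) {
        if (occurrences[salaries[i]] > 1) {
            report.push_back(names[i] + " " + std::to_string(salaries[i]));
        }
    }
    return report;
}
```

The caller can print each entry of the result, or print "no duplicates found" when the result is empty. Neither array is reordered, and the whole thing takes two linear passes on average.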
You can reduce the time complexity of the algorithm to O(n) on average by using an unordered_set, at the expense of additional space.
#include <iostream>
#include <string>
#include <unordered_set>
using namespace std;
int main(){
// Initialise your arrays
unordered_set<string> unique;
bool flag = false;
for(int i=0;i<arr_size;i++){
// Since unordered_set does not support pair out of the box, we will convert the pair to string and use as a key
string key = to_string(arr1[i]) + "-" + arr2[i];
// Check if key exists in set; if not, remember it
if(unique.find(key)==unique.end())
unique.insert(key);
else{
// mark that duplicate found
flag = true;
// Print the duplicate
cout<<"Duplicate: "+to_string(arr1[i])+"-"+arr2[i]<<endl;
}
}
if(!flag){
cout<<"No duplicates found"<<endl;
} else cout<<"Duplicates found"<<endl;
return 0;
}

String matching algorithm trying to correct it

I'm trying to implement a brute-force string matching algorithm, but it is not working correctly: I get an out-of-bounds index error.
Here is my algorithm:
int main() {
string s = "NOBODY_NOTICED_HIM";
string pattern="NOT";
int index = 0;
for (int i = 0; i < s.size();)
{
for (int j = 0; j < pattern.size();)
{
if(s[index] == pattern[j])
{
j++;
i++;
}
else
{
index = i;
j = 0;
}
}
}
cout<<index<<endl;
return 0;
}
FIXED VERSION
I fixed the out of bound exception. I don't know if the algorithm will work with different strings
int main() {
string s = "NOBODY_NOTICED_HIM";
string pattern="NOT";
int index = 0;
int i = 0;
while( i < s.size())
{
i++;
for (int j = 0; j < pattern.size();)
{
if(s[index] == pattern[j])
{
index++;
j++;
cout<<"i is " <<i << " j is "<<j <<endl;
}
else
{
index = i;
break;
}
}
}
cout<<i<<endl;
return 0;
}
The error occurs because the inner for loop keeps running while j is less than pattern.size(), but you also increment i inside its body. Once i goes past s.size(), index goes out of bounds as well, and you get an out-of-bounds error.
The brute-force method has to test the pattern against every possible substring of the same length as the pattern. All substrings of length 3 in s are:
['NOB', 'OBO', 'BOD', 'ODY', 'DY_', 'Y_N', '_NO', 'NOT', 'OTI', 'TIC',
'ICE', 'CED', 'ED_', 'D_H', '_HI', 'HIM']
There are many ways to do it: you can compare char by char, or use string operations like taking a substring. Both are nice exercises for learning.
Starting at index zero in the s string, you take the first three chars and compare them to the pattern; if they are equal, you have the answer. Otherwise you move on to the three chars starting at index one, and so on.
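The char-by-char variant of this idea might look like the sketch below. findPattern is a hypothetical name, and returning -1 for "not found" is a convention chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the index of the first occurrence of pattern in s, or -1 if there is none.
int findPattern(const std::string& s, const std::string& pattern) {
    assert(!pattern.empty());  // this sketch assumes a non-empty pattern
    if (pattern.size() > s.size()) return -1;

    // Try every start position that still leaves room for the whole pattern,
    // so s[start + j] can never run past the end of s.
    for (std::size_t start = 0; start + pattern.size() <= s.size(); start++) {
        std::size_t j = 0;
        while (j < pattern.size() && s[start + j] == pattern[j]) {
            j++;
        }
        if (j == pattern.size()) {
            return static_cast<int>(start);  // all characters matched
        }
    }
    return -1;
}
```

For s = "NOBODY_NOTICED_HIM" and the pattern "NOT" this returns 7, the position of 'NOT' in the list above. The start position only ever advances by one, and the comparison index j is reset for every start, which avoids the shared-counter problem in the question's code.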

How to apply a sort function to a string.find( ) to print results alphabetically?

I have a program that reads a text file into a struct (members- str author and str title) and gives the user the option to display all records in the file, search for an author, or search for a title. Now I need to integrate a sort function into this process, so that when the user searches by author or title the results are listed alphabetically.
The following sort function works perfectly with my showAll function, but I have absolutely no idea how I can modify my search functions to alphabetize the results.
Sort function code:
void sortByTitle(int counter){
//variable
int a, b, minIndex;
string temp;
for (a = 0; a < counter; a++){
minIndex = a;
for (b = a + 1; b < counter; b++){
if (books[b].title < books[minIndex].title){
minIndex = b;
}
}
if(minIndex != a) {
temp = books[a].title;
books[a].title = books[minIndex].title;
books[minIndex].title = temp;
cout << books[a].title << endl;
}
}
}
And this is my current title search function:
int showBooksByTitle(int counter, string bookTitle){
int recordCount = 0;
//find the user-input string inside bookTitle
for (int a = 0; a < counter; a++){ //loop through the whole file
if (books[a].title.find(bookTitle) != string::npos){
//print a matching record
cout << books[a].title << " " << "(" << books[a].author << ")" << endl;
//keep track of the number of matching records
recordCount++;
}
}
return recordCount;
}
My assignment specifies these function headers, that the search functions return the number of records found, and that the data be read into a struct (rather than a vector). So I have to leave those aspects as they are.
How can I apply the selection sort to the search function so that I can print the records in order?
Any help would be much appreciated!
Here you set
smallestIndex = index;
then you check if
books[index].title < books[smallestIndex].title
It appears you are trying to implement a selection sort. A good snippet can be found here.
Furthermore:
I don't see the need for declaring loc outside the for-loop. Just write for(int loc....
There is an abundance of sorting algorithms on Wikipedia, and you can use std::sort.
In the second line you shadow index with a variable of the same name in the for-loop.
I don't know how you store books, but if you would use std::vector you don't have to pass counter every time. You can just use books.size().
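One way to get sorted search results while keeping the required function header is to copy the matching records into a temporary array first, run the selection sort on that array only, and then print it. The sketch below assumes a global books array of a struct with title and author members, as in the question. The Book type name, MAX_BOOKS and the helper findBooksByTitleSorted are made up for this example.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Hypothetical stand-ins for the question's struct and global array.
struct Book {
    std::string title;
    std::string author;
};
const int MAX_BOOKS = 100;
Book books[MAX_BOOKS];

// Copies all records whose title contains bookTitle into sorted[], orders
// them by title with a selection sort, and returns how many were found.
int findBooksByTitleSorted(int counter, const std::string& bookTitle, Book sorted[]) {
    assert(counter <= MAX_BOOKS);
    int recordCount = 0;
    for (int a = 0; a < counter; a++) {
        if (books[a].title.find(bookTitle) != std::string::npos) {
            sorted[recordCount++] = books[a];
        }
    }
    for (int a = 0; a < recordCount; a++) {
        int minIndex = a;
        for (int b = a + 1; b < recordCount; b++) {
            if (sorted[b].title < sorted[minIndex].title) minIndex = b;
        }
        if (minIndex != a) {
            Book temp = sorted[a];  // swap whole records so title and author stay together
            sorted[a] = sorted[minIndex];
            sorted[minIndex] = temp;
        }
    }
    return recordCount;
}

// Keeps the header from the assignment: prints the sorted matches, returns the count.
int showBooksByTitle(int counter, std::string bookTitle) {
    Book matches[MAX_BOOKS];
    int recordCount = findBooksByTitleSorted(counter, bookTitle, matches);
    for (int a = 0; a < recordCount; a++) {
        std::cout << matches[a].title << " (" << matches[a].author << ")" << std::endl;
    }
    return recordCount;
}
```

Because only the copy is sorted, the order of the original books array is left untouched, so showAll and the author search keep behaving as before.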