A vector of vector problems - c++

I'm trying to loop through a list of strings and find where a given character is located in each string. I then store the string in a particular vector based on where (or whether) the character occurs. I'm getting a runtime error in the following code before the loop finishes executing. I've looked it over half a dozen times already and can't find anything wrong.
vector< vector<string> > p;
for(list< string >::iterator ix = dictionary.begin(); ix != dictionary.end(); ix++)
{
int index = contains(*ix, guess);
index++;
p.at(index).push_back(*ix); //0 will contain all the words that do not contain the letter
//1 will be the words that start with the char
//2 will be the words that contain the the char as the second letter
//etc...
}
int contains(string str, char c)
{
char *a = (char *)str.c_str();
for(int i = 0; i < (str.size() + 1); i++)
{
if(a[i] == c)
return i;
}
return -1;
}

Change
(str.size() + 1)
...to
str.size()
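The loop only needs to cover indices 0 through str.size() - 1. The extra iteration at index str.size() can only ever compare against the terminating null that c_str() guarantees, and anything past that would be undefined territory.
For that matter, why are you fiddling with the extra char* instead of std::string::operator[]?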
For THAT matter, why don't you simply use std::string::find()?
That is, of course, assuming you're using std::string and not some other string... :)
In fact, back to the call site... string::find() returns the index of where the target character matched, or string::npos if NOT matched. So, can you dispense with the extra function altogether?
string::size_type pos = ix->find( guess ); // find() returns an unsigned size_type, so compare it to npos as one
p.at( ( pos == string::npos ) ? 0 : ( pos + 1 ) ).push_back( *ix );

vector< vector<string> > p defines p as an empty vector. You must add elements to it before using vector::at().
For example:
const size_t MAX_LETTERS_IN_WORD = 30;
vector< vector<string> > p(MAX_LETTERS_IN_WORD);
/* same as before */
As an alternative, you can check p.size() before using at() and push_back() additional elements into p as needed.
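The grow-as-needed alternative might look like the sketch below. It is only an illustration: bucketByPosition is a hypothetical helper name, and it assumes the same bucket layout as the question (slot 0 for words without the letter, slot k for words whose first match is at index k - 1).

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// Hypothetical helper: buckets[0] holds words without `guess`,
// buckets[k] holds words whose first `guess` is at index k - 1.
std::vector<std::vector<std::string>> bucketByPosition(
    const std::list<std::string>& dictionary, char guess)
{
    std::vector<std::vector<std::string>> buckets;
    for (const std::string& word : dictionary) {
        const std::string::size_type pos = word.find(guess);
        const std::size_t index = (pos == std::string::npos) ? 0 : pos + 1;
        // Make sure the slot exists before touching it.
        if (index >= buckets.size())
            buckets.resize(index + 1);
        buckets[index].push_back(word);
    }
    return buckets;
}
```

Because the outer vector is resized before each access, no maximum word length has to be guessed up front.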

The runtime error is probably caused by accessing the vector p at a position that doesn't exist yet. You have to make space in the vector before you access a specific index.

Related

C++ string parser issues

Ok, so I'm working on a homework project in C++ and am running into an issue, and can't seem to find a way around it. The function is supposed to break an input string at user-defined delimiters and store the substrings in a vector to be accessed later. I think I got the basic parser figured out, but it doesn't want to split the last part of the input.
int main() {
string input = "comma-delim-delim&delim-delim";
vector<string> result;
vector<char> delims;
delims.push_back('-');
delims.push_back('&');
int begin = 0;
for (int i = begin; i < input.length(); i++ ){
for(int j = 0; j < delims.size(); j++){
if(input.at(i) == delims.at(j)){
//Compares chars in delim vector to current char in string, and
//creates a substring from the beginning to the current position
//minus 1, to account for the current char being a delimiter.
string subString = input.substr(begin, (i - begin));
result.push_back(subString);
begin = i + 1;
}
The above code works fine for splitting the input string up until the last dash. Anything after that isn't saved as a substring and pushed into the result vector, because the loop never runs into another delimiter. So in an attempt to rectify the matter, I put together the following:
else if(input.at(i) == input.at(input.length())){
string subString = input.substr(begin, (input.length() - begin));
result.push_back(subString);
}
However, I keep getting out of bounds errors with the above portion. It seems to be having an issue with the boundaries for splitting the substring, and I can't figure out how to get around it. Any help?
Remember that .size() is always one more than the last valid index, because indexing starts at 0: a string of size 1 has only index [0]. So input.at(input.length()) is always one past the end and throws, while input.at(input.length()-1) is the last element. Here is an example that works for me: after your loops, just grab the last piece of the string.
if(begin != input.length()){
string subString = input.substr(begin,(input.length()-begin));
result.push_back(subString);
}
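Putting the loop from the question together with that trailing check could look roughly like this. It is a sketch, not the asker's final code: split is a hypothetical helper, and the delimiters are passed as a std::string instead of a vector<char> purely for brevity.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits `input` at any character found in `delims`.
std::vector<std::string> split(const std::string& input, const std::string& delims)
{
    std::vector<std::string> result;
    std::string::size_type begin = 0;
    for (std::string::size_type i = 0; i < input.length(); ++i) {
        if (delims.find(input[i]) != std::string::npos) {
            // Current char is a delimiter: save everything since `begin`.
            result.push_back(input.substr(begin, i - begin));
            begin = i + 1;
        }
    }
    // After the loop, whatever follows the last delimiter is the final piece.
    if (begin != input.length())
        result.push_back(input.substr(begin));
    return result;
}
```

Calling split("comma-delim-delim&delim-delim", "-&") with this sketch yields five pieces, the last one being the part after the final dash.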
Working from the code in the question I've substituted iterators so that we can check for the end() of the input:
int main() {
string input = "comma-delim-delim&delim-delim";
vector<string> result;
vector<char> delims;
delims.push_back('-');
delims.push_back('&');
auto begin = input.begin(); // use iterator
for(auto ii = input.begin(); ii <= input.end(); ii++){
for(auto j : delims) {
if(ii == input.end() || *ii == j){
string subString(begin,ii); // can construct string from iterators; also runs when ii is at end
result.push_back(subString);
if(ii != input.end())
begin = ii + 1;
else
goto done;
}
}
}
done:
return 0;
}
This program uses std::find_first_of to parse the multiple delimiters:
int main() {
string input = "comma-delim-delim&delim-delim";
vector<string> result;
vector<char> delims;
delims.push_back('-');
delims.push_back('&');
auto begin = input.begin(); // use iterator
for(;;) {
auto next = find_first_of(begin, input.end(), delims.begin(), delims.end());
string subString(begin, next); // can construct string from iterators
result.push_back(subString);
if(next == input.end())
break;
begin = next + 1;
}
}

C++ Dynamic Array Inputs

I am using two dynamic arrays to read from a file. They keep track of each word and the number of times it appears. If a word has already appeared, I must update its count in one array and not add it to the other array again, since it already exists there. However, I am getting blank entries in my array whenever I meet a duplicate. I think it's because my pointer continues to advance when it shouldn't, and I do not know how to combat this. The only workaround I have is to skip empty entries with a continue when I print the results: if (*(words + i) == "") continue;. This basically ignores those blanks in the array, but I think that is messy. I just want to figure out how to move the pointer back in this method. words and frequency are my dynamic arrays.
I would like guidance in what my problem is, rather than solutions.
I have now changed my outer loop to be a while loop, and only increment when I have found the word. Thank you WhozCraig and poljpocket.
Now this occurs.
Instead of incrementing your loop variable [i] every loop, you need to only increment it when a NEW word is found [i.e. not one already in the words array].
Also, you're wasting time in your inner loop by looping through your entire words array, since words will only exist up to index i.
int idx = 0;
bool isFound = false;
while (file >> hold && idx < count) {
if (!valid_word(hold)) {
continue;
}
// You don't need to check past idx because you
// only have <idx> words so far.
for (int i = 0; i < idx; i++) {
if (toLower(words[i]) == toLower(hold)) {
frequency[i]++;
isFound = true;
break;
}
}
if (!isFound) {
words[idx] = hold;
frequency[idx] = 1;
idx++;
}
isFound = false;
}
First, to address your code, this is what it should probably look like. Note how we only increment i as we add words, and we only ever scan the words we've already added for duplicates. Note also how the first pass will skip the j-loop entirely and simply insert the first word with a frequency of 1.
void addWords(const std::string& fname, int count, string *words, int *frequency)
{
std::ifstream file(fname);
std::string hold;
int i = 0;
while (i < count && (file >> hold))
{
int j = 0;
for (; j<i; ++j)
{
if (toLower(words[j]) == toLower(hold))
{
// found a duplicate at j
++frequency[j];
break;
}
}
if (j == i)
{
// didn't find a duplicate
words[i] = hold;
frequency[i] = 1;
++i;
}
}
}
Second, to really address your code, this is what it should actually look like:
#include <iostream>
#include <fstream>
#include <map>
#include <string>
//
// Your implementation of toLower() goes here.
//
typedef std::map<std::string, unsigned int> WordMap;
WordMap addWords(const std::string& fname)
{
WordMap words;
std::ifstream inf(fname);
std::string word;
while (inf >> word)
++words[toLower(word)];
return words;
}
If it isn't obvious by now how a std::map<> makes this task easier, it never will be.
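For anyone who wants to try the map approach without a file on disk, here is a small sketch. The toLower() shown is an assumed implementation (the original is not given), and countWords is a hypothetical variant of addWords that reads from any std::istream.

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Assumed implementation of toLower(); the original is not shown.
std::string toLower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char ch) { return static_cast<char>(std::tolower(ch)); });
    return s;
}

// Same counting loop as above, but reading from any input stream.
std::map<std::string, unsigned int> countWords(std::istream& in)
{
    std::map<std::string, unsigned int> words;
    std::string word;
    while (in >> word)
        ++words[toLower(word)];   // a missing key starts at 0, then becomes 1
    return words;
}
```

Feeding it a std::istringstream holding "The cat and the Cat" gives three entries, with "the" and "cat" each counted twice. There is no duplicate search and no insertion index to keep in sync, which is why the blank-entry problem cannot occur.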
If you want to move the file position back, check out SEEK_CUR, the origin constant used with fseek() (std::ios::cur with seekg() on a stream).
The problem is a logical one. Consider these situations:
1. Your algorithm does not find the current word. It is inserted at position i of your arrays.
2. Your algorithm does find the word. The frequency of the word is incremented along with i, which leaves you with blank entries in your arrays whenever there's a word which is already present.
To conclude, 1 works as expected but 2 doesn't.
My advice is that you don't rely on for loops to traverse the string but use a "get-next-until-end" approach which uses a while loop. With this, you can track your next insertion point and thus get rid of the blank entries.
int currentCount = 0;
while (file >> hold) // stop as soon as a read fails
{
// your inner for loop
if (!found)
{
*(words + currentCount) = hold;
*(frequency + currentCount) = 1;
currentCount++;
}
}
Why not use a std::map?
void collect( std::string name, std::map<std::string,int> & freq ){
std::ifstream file;
file.open(name.c_str(), std::ifstream::in );
std::string word;
while( file >> word ){ // stops at EOF or on any read failure; add toLower
freq[word]++;
}
file.close();
}
The problem with your solution is the use of count in the inner loop where you look for duplicates. You'll need another variable, say nocc, initially 0, used as limit in the inner loop and incremented whenever you add another word that hasn't been seen yet.

Counting words in a c string

I need help completing this function so that it correctly returns the number of words in the C string. Maybe my logic is wrong?
#include <iostream>
#include <string>
#include <cctype>
int countwords(char *, int);
using namespace std;
int main()
{
char a[] = "Four score and seven";
int size = sizeof(a)/sizeof(char);
cout << countwords(a,size);
return 0;
}
int countwords(char* a, int size){
int j = 0;
for(int i = 0; i < size; i++){
if(isspace(i) and isalnum(i - 1) and isalnum(i + 1))
j++;
}
return j;
}
You are passing the value of i to these functions instead of a[i]. That means you're testing if your loop variable is a space (for example), rather than the character at that position in the a array.
Once you have fixed that, understand that you can't blindly reference a[i-1] in that loop (because of the possibility of accessing a[-1]). You will need to update your logic. (Note also that && is the conventional spelling of logical AND; and is a valid alternative token in standard C++, but it is rarely used.)
I suggest using a flag to indicate whether you are currently "inside" a word. And reset that flag whenever you decide that you are no longer inside a word. eg
int inside = 0;
for (int i = 0; i < size; i++) {
    if (isalnum((unsigned char)a[i])) {
        if (!inside) {
            inside = 1; // first character of a new word
            j++;
        }
    } else {
        inside = 0; // left the word
    }
}
Also, please use strlen(a) instead of sizeof(a)/sizeof(char). If you continue that practice, you're bound to have an accident one day when you try it on a pointer.
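To make the pointer hazard concrete, here is a tiny sketch. textLength is a hypothetical helper name used only for illustration.

```cpp
#include <cstddef>
#include <cstring>

// Inside a function the array argument is just a pointer, so
// sizeof(text) here would be the size of a pointer, not the text length.
std::size_t textLength(const char* text)
{
    return std::strlen(text);   // counts characters up to the terminating null
}
```

For char a[] = "Four score and seven"; the expression sizeof(a)/sizeof(char) is 21 because it also counts the terminating null, while textLength(a) is 20. The sizeof form only works at all while a is still a true array in scope.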
This loop is invalid
for(int i = 0; i < size; i++){
if(isspace(i) and isalnum(i - 1) and isalnum(i + 1))
First of all, you do not check whether the characters of the string are spaces or alphanumeric. You check the variable i, which has nothing in common with the content of the string. Moreover, you attempt to access memory beyond the array.
As you are dealing with a string I would declare the function the following way
size_t countwords( const char *s );
It could be defined as
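size_t countwords( const char *s )
{
    size_t count = 0;
    while ( *s )
    {
        while ( isspace( (unsigned char)*s ) ) ++s;        // skip leading whitespace
        if ( *s ) ++count;                                 // a word starts here
        while ( *s && !isspace( (unsigned char)*s ) ) ++s; // skip to the end of the word
    }
    return ( count );
}
Here a word is any run of non-whitespace characters, so punctuation attached to a word counts as part of it. To treat punctuation as a separator too, use while ( *s && !isalnum( *s ) ) for the first inner loop and while ( isalnum( *s ) ) for the second.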
A simpler version would be to repeatedly call strtok() on the string, and each time that an element is returned, you can increment a word count. This would take care of doubled spaces, and so on. You could even split two words with a comma but no space ("this,error") without difficulty.
Something along the lines of:
char *tok = strtok(s, " ,.;");
while (tok) {
    wordcount++;
    tok = strtok(NULL, " ,.;"); // pass NULL to continue with the same string
}
The only immediate disadvantage is that strtok is destructive, so make a copy before starting.
To count the number of words, you merely need to count the number of times you see a non-whitespace character after a whitespace character. To get things right at the start of the string, assume there is "whitespace" to the left of the string.
int countwords(char* a, int size) {
bool prev_ws = true; // pretend like there's whitespace to the left of a[]
int words = 0;
for (int i = 0; i < size; i++) {
// Is the current character whitespace?
bool curr_ws = isspace( (unsigned char)a[i] );
// If the current character is not whitespace,
// but the previous was, it's the start of a word.
if (prev_ws && !curr_ws)
words++;
// Remember whether the current character was
// whitespace for the next iteration.
prev_ws = curr_ws;
}
return words;
}
You might also notice I included a cast to unsigned char on the call to isspace(). On some platforms, char defaults to signed, but the classifier functions isspace and friends aren't guaranteed to work with negative values. The cast forces all the values to be positive. (More details: http://en.cppreference.com/w/cpp/string/byte/isspace )

Check to see if an element exists in a 2D array C++

I am trying to check whether a character is in my output array. The array holds the frequency of the characters in a string. So I want to say: if the current character is in the array, add 1 to its frequency; otherwise add the character to the array with a frequency of 1. Also, I want the table to display the top 5 highest frequencies in order.
Example of what the table should look like:
character: a b c d
frequency: 1 2 3 4
string input = GetInputString(inputFileName);
char ** output;
for (int i = 0; i < sizeof(output); i++)
{
if (input [count] == output[i][]) // this is where my issue is
{
//.......
}
}
You could use std::vector<std::pair<char,int>> to store each character and its count.
string input("1212345678999");
std::vector<std::pair<char, int>> sp;
for(auto c : input)
{
auto it = std::find_if(sp.begin(), sp.end(),
[=](const std::pair<char, int>& p) {return p.first == c; });
if (it != sp.end())
{
it->second++; // if char is found, increase count
}
else
{
sp.push_back(std::make_pair(c, 1)); // new char, add an entry and initialize count to 1
}
}
To display the top 5 highest frequencies in order, you could sort by count in descending order:
std::sort(sp.begin(), sp.end(),
[](const std::pair<char, int>& p1, const std::pair<char, int>& p2)
{
return p1.second > p2.second;
});
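The counting and sorting steps can be combined into one function. This is a sketch only: topChars is a hypothetical helper, and the choice of std::stable_sort (so that characters with equal counts keep their first-seen order) is my assumption, not part of the answer above.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns up to `n` (character, count) pairs,
// highest count first.
std::vector<std::pair<char, int>> topChars(const std::string& input, std::size_t n)
{
    std::vector<std::pair<char, int>> counts;
    for (char c : input) {
        auto it = std::find_if(counts.begin(), counts.end(),
            [c](const std::pair<char, int>& p) { return p.first == c; });
        if (it != counts.end())
            ++it->second;                            // seen before: bump the count
        else
            counts.push_back(std::make_pair(c, 1));  // first occurrence
    }
    // stable_sort keeps first-seen order among characters with equal counts.
    std::stable_sort(counts.begin(), counts.end(),
        [](const std::pair<char, int>& a, const std::pair<char, int>& b) {
            return a.second > b.second;
        });
    if (counts.size() > n)
        counts.resize(n);                            // keep only the top n
    return counts;
}
```

With the sample input "1212345678999" and n = 5, the first entry is '9' with a count of 3, followed by '1' and '2' with 2 each.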
Assuming your example means that 'a' is at 0,0, 'b' is at 0,2, 1 is at 1,0 etc., which means that the character is always in the first row, you just have to iterate through every entry output[0][x].
// you should declare your array as an array
char output[2][26] = {0}; // {0} initialises all elements;
// ... assign values to output.
// I assume there's a count loop here, that checks for the upper bounds of input.
// ...
// You have to determine how many columns there are somehow,
// I just made a static array of 2,26
const int columnsize = 26;
for (int i = 0; i < columnsize; i++)
{
if ( input[count] == output[0][i] )
{
// found the character
}
}
This is to make your implementation work, but there are better or at least easier ways to do this. For instance, if your array sizes aren't fixed at compile time, you could use a vector of vectors. Or if you just want to track the occurrences of characters, you could use a std::map of characters to frequencies.
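A minimal sketch of the map idea, assuming all you need is a count per character (charFrequency is a hypothetical name):

```cpp
#include <map>
#include <string>

// Hypothetical helper: maps each character of `input` to its count.
std::map<char, int> charFrequency(const std::string& input)
{
    std::map<char, int> freq;
    for (char c : input)
        ++freq[c];   // operator[] value-initialises a missing entry to 0
    return freq;
}
```

There is no "is it already in the array?" check at all here: looking up a character that is not yet present creates its entry, so the same line handles both cases.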

how to search properly the vector for a value

The problem I have is that I need to add the missing chars to a vector.
For example I have initially
s,a,p,i,e,n,t,i,a
and I have to add missing chars to it
s,a,p,i,e,n,t,i,a,b,c,d ...
I am trying to use this code to search for an existing value.
for(char c='a';c!='z';++c)
{
if (vec.end()!=find(vec.begin(),vec.end(),c))
vec.push_back(c);
}
The find returns last when it fails to locate a value. But how do I know if last value was in it?
EDIT
When the for loop starts, for 'a' returns vec.end() so it should not go in, but goes in, and adds 'a' again in the end.
(The bug I have, the value in last position gets inserted twice, I have to omit this)
In your case it's best to:
1. Create one vector<bool> with indexes from 'a' to 'z', initialized to false (i).
2. Run once through your original vector and, for each character you find, set the corresponding element of the other vector to true.
3. Run once through this new vector and, for each false value, append the corresponding char to the original.
(i) You may use actual_index = character - 'a'. Put some assertions here and there so that you don't crash for characters outside the range you are checking, presumably 'a' to 'z' (which by the way is not a strict definition of what a char is).
With just one initialization, two linear passes and no searches, you'll be done.
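Those steps could be sketched as follows. appendMissingLetters is a hypothetical name, a range check stands in for the suggested assertions, and the code assumes 'a' to 'z' are contiguous in the character set (true for ASCII).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: appends every letter 'a'..'z' not already in `vec`.
// Assumes 'a'..'z' are contiguous in the character set (true for ASCII).
void appendMissingLetters(std::vector<char>& vec)
{
    std::vector<bool> seen(26, false);
    // Pass 1: mark every letter that is already present.
    for (char c : vec) {
        if (c >= 'a' && c <= 'z')   // ignore anything outside the range
            seen[c - 'a'] = true;
    }
    // Pass 2: append the letters that were never marked.
    for (std::size_t i = 0; i < seen.size(); ++i) {
        if (!seen[i])
            vec.push_back(static_cast<char>('a' + i));
    }
}
```

Starting from s,a,p,i,e,n,t,i,a this appends b, c, d and so on, and never re-adds a letter that was already there.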
What others have answered is true but you should also change the termination condition in your for loop to c <= 'z' if you want the letter z to be included in your list.
EDIT
I can't help adding that with the Boost.RangeEx library your problem can be solved with a one-liner:
boost::set_difference(boost::counting_range('a', char('z' + 1)),
std::set<char>(vec.begin(), vec.end()),
std::back_inserter(vec));
Nope, end() is not the last element of the vector but past it. To iterate over all elements you normally do
for(it= vec.begin(); it!= vec.end(); it++) ...
So whatever your problem is, this is ok.
When find succeeds, it returns an iterator pointing to the found position. So if the value is in the vector, the return value will be something other than vec.end(). The comparison in the if condition should be == and not != if you are trying to create a vector of unique characters.
If you need to find a value in your container, then the greatest likelihood is that you need to use a different sort of container, where searching is fast!
Have a Look at this very useful diagram by Adrinael:
(source: adrinael.net)
(In your case I believe std::set is probably the most appropriate)
vec.end() returns an iterator whose position is past the last element in the vector. If it matches on the last value, the returned iterator will not be equal to vec.end().
You want to insert the character into array if it is NOT found, when it == end.
You are checking if it != end so you insert characters when they are already found in the string. See this:
char v[] = "acef";
vector<char> str(v,v+sizeof(v));
copy(str.begin(), str.end(), ostream_iterator<char>(cout, ","));
cout << endl;
for (char c = 'a'; c < 'g'; ++c)
{
vector<char>::iterator it = find(str.begin(),str.end(), c);
if (it == str.end())
{
str.push_back(c);
}
}
copy(str.begin(), str.end(), ostream_iterator<char>(cout, ","));
output:
a,c,e,f,,
a,c,e,f,,b,d,
The extra empty character ,, is the null in the original string "acef" - null terminated.
You can start by sorting the vector. That way, gaps in the sequence show up immediately:
#include <stdio.h>
#include <stdlib.h>
#include <vector>
#include <iostream>
#include <algorithm>
using namespace std;
int main(int argc, char** argv)
{
vector<char> original;
original.push_back('a');
original.push_back('d');
original.push_back('x');
original.push_back('j');
original.push_back('z');
sort(original.begin(), original.end());
vector<char> unseen_chars;
char current_char = 0;
char last_char = original[0];
for (size_t i = 1; i < original.size(); i++) // stop before size() to stay in bounds
{
current_char = original[i];
for ( char j = last_char + 1; j < current_char; j++)
{
unseen_chars.push_back(j);
}
last_char = current_char;
}
for (int i = 0; i < unseen_chars.size(); i++)
{
cout << unseen_chars[i];
}
cout << endl;