C++ Character Frequency Linked List - c++

For my project, I need to read in a file and count the times each character appears and store it in a linked list. Below is what I have for the reading in the file portion of the program:
ifstream inFile;
ofstream outFile;
inFile.open("inputfile.txt");
char ch;
list<charFrequency> charFreqList;
list<charFrequency>::iterator i;
inFile >> ch;
while (!inFile.eof())
{
charFrequency cf(ch);
charFreqList.push_back(cf);
for (i = charFreqList.begin(); i != charFreqList.end(); ++i)
{
if (i->getCharacter() == cf.getCharacter())
{
i->increment();
charFreqList.pop_back();
}
}
inFile >> ch;
}
inFile.close();
I need the program to go through the list and, if the character is already there, just increment its count so that only one instance of each character stays in the list. However, I get an error message stating "list iterator not incrementable". I know it has to do with pop_back(), since it removes the last element while I am still iterating, but I don't know how to avoid the issue.
Thanks in advance for the help!

First, only add a charFrequency if the character doesn't exist, otherwise you increment the count. It would also be easier if a new charFrequency started with a count of 1.
Second, change your input loop to not check for eof(). This is explained in many threads on SO.
class charFrequency
{
int count;
char ch;
public:
void increment() { ++count; }
char getCharacter() const { return ch; }
int getCount() const { return count; }
charFrequency(char c) : ch(c), count(1) {}
};
Last, for kicks, let's use some C++ and use the std::find_if() algorithm function instead of writing a loop. Introduce a function object below that finds a character.
struct FindCharacter
{
char ch;
FindCharacter(char c) : ch(c) {}
bool operator()(const charFrequency& cf) const
{ return cf.getCharacter() == ch; }
};
So now putting this all together, we have this:
#include <list>
#include <algorithm>
#include <fstream>
//...
while (ifs >> ch) // read first, then test: the body only runs after a successful read
{
// Search for character
std::list<charFrequency>::iterator it = std::find_if(charFreqList.begin(), charFreqList.end(), FindCharacter(ch));
// if not found, add new charFrequency to list
if ( it == charFreqList.end())
charFreqList.push_back(charFrequency(ch));
else
it->increment(); // increment
}
//...

You don't need to push the element onto the list and then pop it back off. You could do something like the code below, or break out of the for loop in your code when a match is found and remove the last element afterwards (outside the loop).
list<charFrequency> charFreqList;
list<charFrequency>::iterator i;
while (inFile >> ch) //loop ends as soon as a read fails
{
Predicate pred(ch); //Write your predicate functor
//check if the char is present
i = std::find_if(charFreqList.begin(), charFreqList.end(), pred);
if( i != charFreqList.end() )
i->increment();
else
charFreqList.push_back(charFrequency(ch)); //insert only when not present
}
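However, std::list is the wrong data structure here: every lookup is a linear search through the list, so the run-time complexity is poor. A std::map<char, int>, or a plain array of counts indexed by character, would serve better.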

Related

c++ How to read from a file into array one word at a time

I know this is a dumb question!
But I just CAN NOT get my head around how to read my file into an array one word at a time using c++
Here is the code for what I was trying to do - with some attempted output.
void readFile()
{
int const maxNumWords = 256;
int const maxNumLetters = 32 + 1;
int countWords = 0;
ifstream fin;
fin.open ("madLib.txt");
if (!fin.is_open()) return;
string word;
while (fin >> word)
{
countWords++;
assert (countWords <= maxNumWords);
}
char listOfWords[countWords][maxNumLetters];
for (int i = 0; i <= countWords; i++)
{
while (fin >> listOfWords[i]) //<<< THIS is what I think I need to change
//buggered If I can figure out from the book what to
{
// THIS is where I want to perform some manipulations -
// BUT running the code never enters here (and I thought it would)
cout << listOfWords[i];
}
}
}
I am trying to get each word (defined by a space between words) from the madLib.txt file into the listOfWords array so that I can then perform some character by character string manipulation.
Clearly I can read from a file and get that into a string variable - BUT that's not the assignment (Yes this is for a coding class at college)
I have read from a file to get integers into an array - but I can't quite see how to apply that here...
The simplest solution I can imagine to do this is:
void readFile()
{
ifstream fin;
fin.open ("madLib.txt");
if (!fin.is_open()) return;
vector<string> listOfWords;
std::copy(std::istream_iterator<string>(fin), std::istream_iterator<string>()
, std::back_inserter(listOfWords));
}
Anyways, you stated in your question you want to read one word at a time and apply manipulations. Thus you can do the following:
void readFile()
{
ifstream fin;
fin.open ("madLib.txt");
if (!fin.is_open()) return;
vector<string> listOfWords;
string word;
while(fin >> word) {
// THIS is where I want to perform some manipulations
// ...
listOfWords.push_back(word);
}
}
On the suggestion of πάντα ῥεῖ
I've tried this:
void readFile()
{
int const maxNumWords = 256;
int const maxNumLetters = 32 + 1;
int countWords = 0;
ifstream fin;
fin.open ("madLib.txt");
if (!fin.is_open()) return;
string word;
while (fin >> word)
{
countWords++;
assert (countWords <= maxNumWords);
}
fin.clear();
fin.seekg(0);
char listOfWords[countWords][maxNumLetters];
for (int i = 0; i < countWords && fin >> listOfWords[i]; i++) // one word per row; the read itself did NOT need changing
{
// THIS is where I want to perform some manipulations -
cout << listOfWords[i];
}
}
and it has worked for me. I do think using vectors is more elegant, and so have accepted that answer.
The suggestion was also made to post this as a self answer rather than as an edit - which I kind of agree is sensible so I've gone ahead and done so.
The simplest way to do that is to use the STL algorithms. Here is an example:
#include <iostream>
#include <iomanip>
#include <iterator>
#include <string>
#include <vector>
#include <algorithm>
using namespace std;
int main()
{
vector<string> words;
auto beginStream = istream_iterator<string>{cin};
auto eos = istream_iterator<string>{};
copy(beginStream, eos, back_inserter(words));
// print the content of words to standard output
copy(begin(words), end(words), ostream_iterator<string>{cout, "\n"});
}
Instead of cin, you can of course use any istream object, such as an ifstream opened on a file.

C++ Using Method on Vector (If is word)

So I have a couple of methods defined: one checks whether there is a next word, and one returns the next word.
I take text files as input, add each file's contents to a vector, and then loop over it.
I can make it work by simply doing a for loop through the vector:
for (int x = 0; x != v2.size(); ++x) {
But I want to use my two methods
bool ReadWords::isNextWord()
and
string ReadWords::getNextWord()
{
//take the input of the file until end of list is reached and return it
}
So how would I do something like
vector<string> bigTextFile;
vector<string> vct;
while(vct.isNextWord) {
vct.getNextWord
if(vct.getNextWord == bigTextFile[i] {
counter++
}
}
Let me know if you need any more of the code
What about this:
vector<string> bigTextFile;
vector<string> vct;
while(ReadWords::isNextWord(vct))
{
string nextWord = ReadWords::getNextWord(vct);
if(nextWord == bigTextFile[i])
{
counter++;
}
}
You could make your functions take a vector as a parameter and then do the work on the vector inside each function. I'm not 100% sure what your implementation is, since it's not the complete code.
I'm not sure why you are trying to write an object, but here is how I would have approached the problem:
Step 1. Create a predicate for the words
bool is_needed_word(const std::string& word, const std::vector<std::string>& needed_words)
{
// true only if word appears in needed_words
return std::find(needed_words.begin(),
needed_words.end(),
word) != needed_words.end();
}
//some where in your function
std::vector<std::string> needed_words{"apple", "moon", "etc"};
auto predicate = [&needed_words](const auto& word)
{return is_needed_word(word, needed_words);};
Step 2. Copy/count/whatever you want to do with the words in the stream
//copying
std::vector<std::string> extracted_words;
std::ifstream input_file{"input.txt"};
//check if opened correctly
std::copy_if(std::istream_iterator<std::string>(input_file), {},
std::back_inserter(extracted_words), predicate);
//counting (use a freshly opened stream: the copy above consumes input_file)
std::size_t occurrence_counter = std::count_if(std::istream_iterator<std::string>(input_file),{},predicate);

Must Return a Value; Vector of Integers; Word Count

I'm having some problems with a program that I'm writing where you find the count of words in a file. The issue I'm having is with my function "get_word_counts" in my cpp. I'm continually getting a message in Visual Studio that states "error C4716: 'TextCounter::get_word_counts': must return a value", despite the fact that I do, in fact, return a value at the end of the function.
Can someone help me understand what the issue with this is? I have searched everywhere but I can't seem to figure out exactly what the problem is. Perhaps it's something simple, but I just can't see it.
I'll post below my cpp file as well as the header:
cpp:
#include"TextCounter.h"
#include <string>
#include <iostream>
#include <vector>
/*
Constructor takes in the filename and builds the map.
*/
TextCounter::TextCounter(std::string file):
filename(file) {
// Build the input list.
parse_file();
}
/*
Get the count of each word in the document.
If a word doesn't occur in the document, put 0.
*/
std::vector<int> TextCounter::get_word_counts(const std::vector<std::string>& words) {
// TODO: Finish this method.
std::vector<int> result;
std::unordered_map<std::string, int>::iterator iter;
for (auto const &i : words) {
iter = TextCounter::frequency.find(i);
if (iter == frequency.end()) {
result.push_back(0);
}
else {
result.push_back(iter->second);
}
}
return result;
}
// Add a word to the map.
// Check to see if the word exists, if so, increment
// otherwise create a new entry and set it to 1.
void TextCounter::add_word(std::string word) {
// COMP: finished this method.
//Look if it's already there.
if (frequency.find(word) == frequency.end()) // Then we've encountered the word for a first time.
frequency[word] = 1; // Initialize it to 1.
else // Then we've already seen it before..
frequency[word]++; // Increment it.
}
// Parse an input file.
// Return -1 if there is an error.
int TextCounter::parse_file() {
// Local variables.
std::ifstream inputfile; // ifstream for reading from input file.
// Open the filename specified for input.
inputfile.open (filename);
// Tokenize the input.
// Read one character at a time.
// If the character is not in a-z or A-Z, terminate current string.
char c;
char curr_str[MAX_STRING_LEN];
int str_i = 0; // Index into curr_str.
bool flush_it = false; // Whether we have a complete string to flush.
while (inputfile.good()) {
// Read one character, convert it to lowercase.
inputfile.get(c);
c = tolower(c);
if (c >= 'a' && c <= 'z') {
// c is a letter.
curr_str[str_i] = c;
str_i++;
// Check over-length string.
if (str_i >= MAX_STRING_LEN) {
flush_it = true;
}
} else {
// c is not a letter.
// Create a new string if curr_str is non-empty.
if (str_i>0) {
flush_it = true;
}
}
if (flush_it) {
// Create the new string from curr_str.
std::string the_str(curr_str,str_i);
// std::cout << the_str << std::endl;
// COMP: Insert code to handle new entries or increment an existing entry.
TextCounter::add_word(the_str);
// Reset state variables.
str_i = 0;
flush_it = false;
}
}
// Close input file.
inputfile.close();
return 0;
}
header:
#include <fstream>
#include <unordered_map>
#include <string>
#include <vector>
#define MAX_STRING_LEN 256
#define DICT_SIZE 20000
class TextCounter {
public:
explicit TextCounter(std::string file = "");
// Get the counts of a vector of words.
std::vector<int> get_word_counts(const std::vector<std::string>& words);
private:
// Name of the input file.
std::string filename;
// COMP: Implement a data structure to keep track of each word and the
// number of times that word occurs in the document.
std::unordered_map<std::string, int> frequency;
// Parse an input file.
int parse_file();
// Add a word to the map.
void add_word(std::string word);
};
Any help would be greatly appreciated.
Thank you!

Pseudo-istream pointer return

I've been going through Stroustrup's Programming and Principles to teach myself c++11.
In chapter 11, he describes a program that removes (turns into whitespace) any un-wanted characters from an input stream. So, for example, I could set a string to hold the characters '!' and '.'. And then I could input
dog! food and receive the output dog food .
However, I'm not understanding how the string, word in main
int main ()
{
Punct_stream ps {cin};
ps.whitespace(";:,.?!()\"{}<>/&$##%^*|~");
ps.case_sensitive(false);
cout<<"Please input words."<<"\n";
vector<string> vs;
for (string word; ps>>word;) { // how does word get assigned a string?
vs.push_back(word);
}
sort(vs.begin(), vs.end());
for (int i = 0; i<vs.size(); ++i) {
if (i==0 || vs[i]!=vs[i-1]) cout<<vs[i]<<"\n";
}
}
is assigned a value through the overloaded definition of >>.
Punct_stream& Punct_stream::operator>>(string& s)
{
while (!(buffer>>s)) {
if (buffer.bad() || !source.good()) return *this;
buffer.clear();
string line;
getline(source,line); // get a line from source
for (char& ch : line)
if (is_whitespace(ch))
ch = ' ';
else if (!sensitive)
ch = tolower(ch);
buffer.str(line); //how does word become this value?
}
return *this;
}
Obviously, pointer this will be the result of >>, but I don't understand how that result includes assigning word the string of istringstream buffer. I only know the basics of pointers, so maybe that's my problem?
#include<algorithm>
#include<iostream>
#include<sstream>
#include<string>
#include<vector>
using namespace std;
class Punct_stream {
public:
Punct_stream(istream& is)
: source{is}, sensitive{true} { }
void whitespace(const string& s) { white = s; }
void add_white(char c) { white += c; }
bool is_whitespace(char c);
void case_sensitive(bool b) { sensitive = b; }
bool is_case_sensitive() { return sensitive; }
Punct_stream& operator>>(string& s);
operator bool();
private:
istream& source;
istringstream buffer;
string white;
bool sensitive;
};
Punct_stream& Punct_stream::operator>>(string& s)
{
while (!(buffer>>s)) {
if (buffer.bad() || !source.good()) return *this;
buffer.clear();
string line;
getline(source,line); // get a line from source
for (char& ch : line)
if (is_whitespace(ch))
ch = ' ';
else if (!sensitive)
ch = tolower(ch);
buffer.str(line); //how does word become this value?
}
return *this;
}
Punct_stream::operator bool()
{
return !(source.fail() || source.bad()) && source.good(); }
bool Punct_stream::is_whitespace(char c) {
for (char w : white)
if (c==w) return true;
return false;
}
int main ()
{
Punct_stream ps {cin};
ps.whitespace(";:,.?!()\"{}<>/&$##%^*|~");
ps.case_sensitive(false);
cout<<"Please input words."<<"\n";
vector<string> vs;
for (string word; ps>>word;) { // how does word get assigned a string?
vs.push_back(word);
}
sort(vs.begin(), vs.end());
for (int i = 0; i<vs.size(); ++i) {
if (i==0 || vs[i]!=vs[i-1]) cout<<vs[i]<<"\n";
}
}
The trick is that the while loop inside operator >> has opposite logic to what you normally do when reading from a stream. Normally, you'd do something like this (and main does it, in fact):
while (stream >> aString)
Notice, however, that the while in the extractor has a negation:
while (!(buffer>>s))
In words: try extracting s from buffer. If you fail, do one iteration of the loop and try again.
At the start, buffer is empty, so extracting s fails and the loop body is entered. The loop body reads a line from source (the stream being wrapped), transforms selected characters of that line into whitespace, and sets this line as the content of buffer (via the buffer.str(line); call).
So, after the line was transformed, it is queued into buffer. Then the next iteration of the loop comes, and it again tries to extract s from buffer. If the line had any non-whitespace, the first word will be extracted (and the rest will remain in buffer for further readings). If the line had whitespace only, the loop body is entered again.
Once s is successfully extracted, the loop terminates and the function exits.
On next call, it will work with whatever was left in buffer, re-filling buffer from source as necessary (by the process I've explained above).

C++ Counting words in a file between two words

I am currently trying to count the number of words in a file. After this, I plan to make it count the words between two words in the file. For example, my file may contain "Hello my name is James". I want to count the words, so 5, and then count the number of words between "Hello" and "James", so the answer would be 3. I am having trouble accomplishing both tasks, mainly because I am not exactly sure how to structure my code.
Any help on here would be greatly appreciated. The code I am currently using is using spaces to count the words.
Here is my code:
readwords.cpp
string ReadWords::getNextWord()
{
bool pWord = false;
char c;
while((c = wordfile.get()) !=EOF)
{
if (!(isspace(c)))
{
nextword.append(1, c);
}
return nextword;
}
}
bool ReadWords::isNextWord()
{
if(!wordfile.eof())
{
return true;
}
else
{
return false;
}
}
main.cpp
main()
{
int count = 0;
ReadWords rw("hamlet.txt");
while(rw.isNextWord()){
rw.getNextWord();
count++;
}
cout << count;
rw.close();
}
What it does at the moment is count the number of characters. I'm sure it's just a simple fix and something silly that I'm missing, but I've been trying for long enough that it's time to ask for help.
Any help is greatly appreciated. :)
Rather than parse the file character-by-character, you can simply use istream::operator>>() to read whitespace-separated words. >> returns the stream, which evaluates to true as a bool as long as the read succeeded.
vector<string> words;
string word;
while (wordfile >> word)
words.push_back(word);
There is a common formulation of this using the <iterator> and <algorithm> utilities, which is more verbose, but can be composed with other iterator algorithms:
istream_iterator<string> input(wordfile), end;
copy(input, end, back_inserter(words));
Then you have the number of words and can do with them whatever you like:
words.size()
If you want to find "Hello" and "James", use find() from the <algorithm> header to get iterators to their positions:
// Find "Hello" anywhere in 'words'.
const auto hello = find(words.begin(), words.end(), "Hello");
// Find "James" anywhere after 'hello' in 'words'.
const auto james = find(hello, words.end(), "James");
If they’re not in the vector, find() will return words.end(); ignoring error checking for the purpose of illustration, you can count the number of words between them by taking their difference, adjusting for the inclusion of "Hello" in the range:
const auto count = james - (hello + 1);
You can use operator-() here because std::vector::iterator is a “random-access iterator”. More generally, you could use std::distance() from <iterator>:
const auto count = distance(hello, james) - 1;
Which has the advantage of being more descriptive of what you’re actually doing. Also, for future reference, this kind of code:
bool f() {
if (x) {
return true;
} else {
return false;
}
}
Can be simplified to just:
bool f() {
return x;
}
Since x is already being converted to bool for the if.
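Here is a small sketch that combines the find() and distance() steps above and adds the error checking that was skipped. The function name wordsBetween and the -1 return value for "not found" are choices made for this example only:

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: number of words strictly between first and last,
// or -1 if either word is missing.
std::ptrdiff_t wordsBetween(const std::vector<std::string>& words,
                            const std::string& first, const std::string& last)
{
    const auto start = std::find(words.begin(), words.end(), first);
    if (start == words.end()) return -1;
    // Search for the closing word only after the opening one.
    const auto stop = std::find(start + 1, words.end(), last);
    if (stop == words.end()) return -1;
    return std::distance(start, stop) - 1; // exclude the opening word itself
}
```

For the words of "Hello my name is James" and the pair "Hello"/"James" this gives 3.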
To count:
std::ifstream infile("hamlet.txt");
std::size_t count = 0;
for (std::string word; infile >> word; ++count) { }
To count only between start and stop:
std::ifstream infile("hamlet.txt");
std::size_t count = 0;
bool active = false;
for (std::string word; infile >> word; )
{
if (!active) { if (word == "Hello") active = true; continue; } // skip everything up to and including "Hello"
if (word == "James") break;
++count;
}
I think "return nextword;" should instead be "else return nextword;" or else you are returning from the function getNextWord every time, no matter what the char is.
string ReadWords::getNextWord()
{
bool pWord = false;
char c;
while((c = wordfile.get()) !=EOF)
{
if (!(isspace(c)))
{
nextword.append(1, c);
}
else return nextword;//only returns on a space
}
return nextword; //end of file reached: return whatever was collected
}
To count all words:
std::ifstream f("hamlet.txt");
std::cout << std::distance (std::istream_iterator<std::string>(f),
std::istream_iterator<std::string>()) << '\n';
To count between two words:
std::ifstream f("hamlet.txt");
std::istream_iterator<std::string> it(f), end;
int count = 0;
while ((it = std::find(it, end, "Hello")) != end)
while (++it != end && *it != "James")
++count;
std::cout << count;
Try this:
below the line
nextword.append(1, c);
add
continue;