I am very new to C++ and have a program which I included below.
The program I am working on reads text from an input file and counts the number of words and number of occurrences of each letter in the text and then prints the results. My program is working fine but the problem is all code is written in the main function and I need to break it up into a couple more functions to make the program modular, but I am unsure of how to go about doing this.
I am sure this is pretty simple but I'm not sure where to start. I was thinking of implementing two void functions, one for reading / interpreting what is read from the data file and another that displays the results; and then call them both in the main function, but I'm not sure what to take as arguments for those functions.
int main()
{
// Declaring variables
char c; // char that will store letters of alphabet found in the data file
int count[26] = {0}; // array that will store the # of occurences of each letter
int words = 1; // int that will store the # of words
string s; // declaring string found in data file
// Opening input file stream
ifstream in;
in.open("word_data.txt");
// Reading text from the data file
getline(in, s);
//cout << s << endl;
// If input file fails to open, displays an error message
if (in.fail())
{
cout << "Input file did not open correctly" << endl;
}
// For loop for interpreting what is read from the data file
for (int i = 0; i < s.length(); i++) {
// Increment word count if new line or space is found
if (s[i] == ' ' || s[i] == '\n')
words++;
//If upper case letter is found, convert to lower case.
if (s[i] >= 'A' && s[i] <= 'Z')
s[i] = (tolower(s[i]));
//If the letters are found, increment the counter for each letter.
if (s[i] >= 'a' && s[i] <= 'z')
count[s[i] - 97]++;
}
// Display the words count
cout << words << " words" << endl;
// Display the count of each letter
for (int i = 0; i < 26; i++) {
if (count[i] != 0) {
c = i + 97;
cout << count[i] << " " << c << endl;
}
}
// Always close opened files
in.close();
return 0;
}
I would rewrite it like:
class FileReader {
public:
FileReader() {
// Any init logic goes here...
}
~FileReader() {
// Always close opened files.
in.close();
}
void open(const std::string &filePath) { // const reference, so a string literal can be passed
in.open(filePath);
}
std::string readLine() {
std::string s;
getline(in, s);
return s;
}
bool hasErrors() const { // remove const if you get compile-error here.
return in.fail();
}
private:
ifstream in;
};
class LetterCounter {
public:
void process(std::string &s) {
// For loop for interpreting what is read from the data file
for (int i = 0; i < s.length(); i++) {
// Increment word count if new line or space is found
if (s[i] == ' ' || s[i] == '\n')
words++;
//If upper case letter is found, convert to lower case.
if (s[i] >= 'A' && s[i] <= 'Z')
s[i] = (tolower(s[i]));
//If the letters are found, increment the counter for each letter.
if (s[i] >= 'a' && s[i] <= 'z')
count[s[i] - 97]++;
}
}
void logResult() {
char c; // char that will store letters of alphabet found in the data file.
// Display the words count
cout << words << " words" << endl;
// Display the count of each letter
for (int i = 0; i < 26; i++) {
if (count[i] != 0) {
c = i + 97;
cout << count[i] << " " << c << endl;
}
}
}
private:
int count[26] = {0}; // array that will store the # of occurences of each letter
int words = 1; // int that will store the # of words
};
int main()
{
// Opening input file stream.
FileReader reader;
reader.open("word_data.txt");
// Reading text from the data file.
std::string s = reader.readLine();
// If input file fails to open, displays an error message
if (reader.hasErrors()) {
cout << "Input file did not open correctly" << endl;
return -1;
}
LetterCounter counter;
counter.process(s);
// Display word and letter count.
counter.logResult();
return 0;
}
Note that I wrote this without testing it (excuse any mistakes),
but it should give you a general idea of how the code could be structured.
Not sure how to phrase the question, but I'm making a program for an assignment in which we're not allowed to use pre-existing libraries besides input/output. We can also only use primitive data types. I have to read a text file with words, remove all punctuation from each word, and then store those words in a 2D array of characters.
The problem seems to be that when a word starts with a non-alphabetic character, the whole word isn't output when using cout << stack[top], but when I output each individual character with cout << stack[top][i], it produces the expected output.
'stack' is a 2D array which contains characters to make up words.
'top' is a variable to represent the length of stack
Code:
#include <iostream>
#include <fstream>
using namespace std;
// Function Prototypes
void push(char word[]);
char formatCharacter(char letter);
bool isAlphabet(char letter);
char toLowercase(char letter);
// Global Variables
const int STACK_SIZE = 50000;
const int WORD_SIZE = 30;
char stack[STACK_SIZE][WORD_SIZE];
int top = 0;
int words = 0;
int wordCount[STACK_SIZE];
int main(){
// Local Variables
char filename[20];
ifstream fin;
char word[WORD_SIZE];
// Get file input
cerr << "Please enter the name of the input file: ";
cin >> filename;
// Open file
fin.open(filename);
// Print error if file doesn't open, then quit the program.
if (!fin) {
cerr << "Error opening file " << filename << ". Program will exit." << endl;
return 0;
}
// Read the file into the stack
while (fin >> word) {
push(word);
}
// Close file
fin.close();
}
void push(char word[]){
if (top == STACK_SIZE) return;
int i = 0;
int j = 0;
do {
if (isAlphabet(word[i])){
word[i] = formatCharacter(word[i]);
stack[top][i] = word[i];
cout << stack[top][i]; // Output fine
j++;
}
i++;
} while (word[i]);
wordCount[words] = j;
//cout << stack[top] << ": " << wordCount[words] << endl; // Output incorrect
cout << endl;
top++;
words++;
return;
}
bool isAlphabet(char letter){
if ((letter < 'A' || letter > 'Z') && (letter < 'a' || letter > 'z')){
return false;
}
else{
return true;
}
}
char formatCharacter(char letter){
if ((letter < 'A' || letter > 'Z') && (letter < 'a' || letter > 'z')){
letter = '\0';
}
else{
if (letter >= 'A' && letter <= 'Z'){
letter = toLowercase(letter);
}
}
return letter;
}
char toLowercase(char letter){
letter = letter + 32;
return letter;
}
isAlphabet() just checks if it's an alphabetic character
formatCharacter() removes any punctuation by replacing the character with '\0', and also changes uppercase to lowercase.
Input:
Jabberwocky
'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe:
All mimsy were the borogoves,
And the mome raths outgrabe.
Output when using cout << stack[top][i]:
jabberwocky
twas
brillig
and
the
slithy
toves
did
gyre
and
gimble
in
the
wabe
all
mimsy
were
the
borogoves
and
the
mome
raths
outgrabe
Output when using cout << stack[top]:
jabberwocky: 11
: 4
brillig: 7
and: 3
the: 3
slithy: 6
toves: 5
did: 3
gyre: 4
and: 3
gimble: 6
in: 2
the: 3
wabe: 4
all: 3
mimsy: 5
were: 4
the: 3
borogoves: 9
and: 3
the: 3
mome: 4
raths: 5
outgrabe: 8
Notice the word 'twas' is missing. I'd rather not loop through each character of each word to get the output I need. I'd appreciate any advice, thanks!
The simplest fix is to change:
stack[top][i] = word[i];
To:
stack[top][j] = word[i];
^ j not i here
This will ensure that the 'Twas ends up as twas and not \0twas.
Also, formatCharacter() should call isAlphabet() rather than repeat the condition.
I wrote this program for an intro to C++ course. My issue is that unexpected values are being stored in memory. I assume it has to do with input.getline() or the way certain characters are stored, but I don't know enough about what is happening "under the hood" to fix it.
Specifically, certain characters like apostrophes and quotation marks appear to not read as their hex ASCII counterparts.
I'm pretty certain the issue lies in the lines
input.getline(raw_paragraph, MAX_PARAGRAPH_CHARS);
charCount = strlen(raw_paragraph);
Below I've included the complete code, a screenshot of the Memory from Visual Studio 2022, the test case, and the program output.
Thank you in advance!
#pragma warning(disable : 4996) //DEV
/**************************************************************************************
Header Content
**************************************************************************************/
// Includes and namespaces ------------------------------------------------------------
#include <cstdlib> // Defines functions such as exit().
#include <cstring> // Defines functions such as strcmp, etc.
#include <fstream> // Supports file I/O
#include <iostream> // Supports terminal I/O
using namespace std;
// Constants Declared -----------------------------------------------------------------
// Maximum allowable space for input / Defines space for memory allocation
const int MAX_WORD_CHARS = 50; // Longest word = 50 chars
const int MAX_WORDS = 1000; // Longest paragraph = 1000 words
const int MAX_PARAGRAPH_CHARS = 50000; // 50 * 1000
// "to be" Semantics
const char TO[] = "to";
const char BE[] = "be";
const int NUM_TO_BE_VERBS = 5; // Qty of "to be" verbs below
const char TO_BE_VERBS[NUM_TO_BE_VERBS][MAX_WORD_CHARS] =
{ "am", "are", "is", "was", "were" };
// Conjunctions
const int NUM_CONJUNCTIONS = 7; // Qty objects in CONJUNCTIONS below.
const char CONJUNCTIONS[NUM_CONJUNCTIONS][MAX_WORD_CHARS] =
{ "for", "and", "nor", "but", "or", "yet", "so" };
// Punctuation
const int NUM_PUNCTUATIONS = 4;
const char PUNCTUATIONS[NUM_PUNCTUATIONS] = { '.', ',', '?', '!' };
// Functions Declared -----------------------------------------------------------------
int countComplex(char a[][MAX_WORD_CHARS], int b);
int countSentences(char a[], int b);
int count_to_be_verbs(char a[][MAX_WORD_CHARS], int wc);
void init_array(char* a);
void modify_tokens(char a[][MAX_WORD_CHARS], int wc);
int tokenizeParagraph(char p[], char tp[][MAX_WORD_CHARS]);
/**************************************************************************************
Begin Main
**************************************************************************************/
int main()
{
// Format Output
cout.setf(ios::fixed);
cout.setf(ios::showpoint);
cout.precision(1);
// Create input space for user's file request
char filename[256]; // Stores the user defined filename containing plaintext
init_array(filename);
char raw_paragraph[MAX_PARAGRAPH_CHARS]; // Stores plaintext from filename
init_array(raw_paragraph);
// Declare Variables
int charCount = 0; // Number of chars contained in input file except eof.
int complex_count; // Number of complex sentences
int sentenceCount = 0; // Total number of sentences in input
int simpleSent = 0; // Number of simple sentences in input
int to_be_count; // Number of instances of "to be" verbs in input.
int wordCount = 0; // Number of words in input
double averageWordsPerSentence;
// Asks the user for the name of an input file which contains a paragraph
cout << "Enter a filename: ";
cin.getline(filename, 256);
// Try to load the file in filename:
ifstream input;
input.open(filename);
// If file does not exist, cout error then exit(1)
if (input.fail())
{
cout << "Input file " << filename << " does not exist." << endl;
cout << "Thank you for using the English Analyzer." << endl;
exit(1);
}
// If file is empty, cout "Input file _____ is empty." Then exit(1)
char c;
input.get(c);
if (input.eof())
{
cout << "File " << filename << " is empty." << endl;
cout << "Thank you for using the English Analyzer." << endl;
exit(1);
}
else
input.putback(c);
// Store plaintext from file to raw_paragraph
input.getline(raw_paragraph, MAX_PARAGRAPH_CHARS);
// Close ifstream input, will not need it again.
input.close();
// Allocate memory for the output of tokenizeParagraph
char tkn_para[MAX_WORDS][MAX_WORD_CHARS];
// Count chars
charCount = strlen(raw_paragraph);
// Tokenize paragraph, count words
wordCount = tokenizeParagraph(raw_paragraph, tkn_para);
// Count Sentences
sentenceCount = countSentences(raw_paragraph, charCount);
// Average words per sentence
averageWordsPerSentence = double(wordCount) / double(sentenceCount);
// Count Complex Sentences
complex_count = countComplex(tkn_para, wordCount);
// Calculate Simple Sentences
simpleSent = sentenceCount - complex_count;
// Count "to be" verbs
modify_tokens(tkn_para, wordCount);
to_be_count = count_to_be_verbs(tkn_para, wordCount);
// Cout results
cout << "Number of Characters: " << charCount << endl;
cout << "Number of words: " << wordCount << endl;
cout << "Number of sentences: " << sentenceCount << endl;
cout << "Average number words in a sentence: " << averageWordsPerSentence << endl;
cout << "Number of simple sentences: " << simpleSent << endl;
cout << "Number of \"to be\" verbs: " << to_be_count << endl;
}
/**************************************************************************************
Function Definitions
**************************************************************************************/
int countComplex(char a[][MAX_WORD_CHARS], int b)
{
// counter will keep the number of complex sentences found.
int counter = 0;
// For each word in tkn_para,
for (int i = 0; i < b; i++)
{
// If a comma is at the end of tkn_m,
int s = strlen(a[i]) -1;
if (a[i][s] == ',')
{
// For each word in CONJUNCTIONS
for (int x = 0; x < NUM_CONJUNCTIONS; x++)
{
// If the words match,
if (strcmp(a[i + 1], CONJUNCTIONS[x]) == 0)
{
// Increment counter
counter++;
// If a word from a has already been matched, there
// is no reason to try to compare it to more items
// from CONJUNCTIONS. Therefore,
break;
}
}
}
}
// After all iteration has been completed:
return(counter);
}
int countSentences(char a[], int b)
{
int counter = 0;
// For each char in a[]
for (int i = 0; i < b; i++)
{
// If a[i] is an end of sentence punctuation,
if (a[i] == '.' || a[i] == '?' || a[i] == '!')
// Increment counter
counter++;
}
return counter;
}
int count_to_be_verbs(char a[][MAX_WORD_CHARS], int wc)
{
int counter = 0;
// For each word in a:
for (int i = 0; i < wc; i++)
// For each word in TO_BE_VERBS:
for (int y = 0; y < NUM_TO_BE_VERBS; y++)
{
// If words match:
if (strcmp(a[i], TO_BE_VERBS[y]) == 0)
counter++;
}
// For loop checks for "to" token followed by "be"
for (int i = 0; i < wc; i++)
if (strcmp(a[i], TO) == 0 && strcmp(a[i + 1], BE) == 0)
counter++;
return(counter);
}
void init_array(char* a)
{
// For every char in a:
for (int i = 0; i < strlen(a); i++)
// Set the value of a[i] to NULL
a[i] = NULL;
}
void modify_tokens(char a[][MAX_WORD_CHARS], int wc)
{
// For each word in a:
for (int i = 0; i < wc; i++)
{
// Does the computation once instead of 4 times below.
int s = strlen(a[i]) -1;
// Converts first char if uppercase, into lowercase
if (int('#') < a[i][0] && a[i][0] < int('['))
a[i][0] = a[i][0] + 32;
// Convert last char, if punctuation mark, into NULL
if (a[i][s] == ',' || a[i][s] == '!' || a[i][s] == '?' || a[i][s] == '.')
a[i][s] = NULL;
}
}
int tokenizeParagraph(char p[], char tp[][MAX_WORD_CHARS])
{
int i = 0;
char* cPtr;
cPtr = strtok(p, " \n\t");
while (cPtr != NULL)
{
strcpy(tp[i], cPtr);
i++;
cPtr = strtok(NULL, " \n\t");
}
return(i);
}
It turns out that the issue has to do with copying the test case into MS Word or a similar application. There are typographic equivalents of apostrophes and quotation marks that "lean" left or right. Those characters are distinct from the plain ASCII ones, and they are responsible for the memory values I've been encountering. This suggests that a future iteration of the code would have to scan the raw input for those characters and replace them.
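A minimal sketch of that replacement step follows. It assumes the input file is encoded as UTF-8, where each curly quote is a three-byte sequence; a file saved as Windows-1252 would instead use the single bytes 0x91 to 0x94 and would need different patterns. The function names are made up for illustration, and std::string is used for brevity even though the program above works on char arrays.

```cpp
#include <string>

// Replaces every occurrence of `from` in `text` with `to`.
void replaceAll(std::string& text, const std::string& from, const std::string& to) {
    if (from.empty())
        return;
    std::string::size_type pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.length(), to);
        pos += to.length();  // continue after the replacement
    }
}

// Hypothetical helper: maps UTF-8 curly quotes to their plain ASCII forms.
std::string straightenQuotes(std::string text) {
    replaceAll(text, "\xE2\x80\x98", "'");   // U+2018 left single quotation mark
    replaceAll(text, "\xE2\x80\x99", "'");   // U+2019 right single quotation mark
    replaceAll(text, "\xE2\x80\x9C", "\"");  // U+201C left double quotation mark
    replaceAll(text, "\xE2\x80\x9D", "\"");  // U+201D right double quotation mark
    return text;
}
```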
I'm trying to make a program that uses stacks (with pop, push, etc.) to read in a text file with lots of sentences, taken one at a time, and output whether each line is a palindrome or not (text that is spelled the same forwards and backwards). I believe it's very close to being a completed program, but it only returns false even when the string is a palindrome. I want it to return true when the string is in fact a palindrome.
EDIT: Tried a new method with three stacks instead. Still getting a false return from bool tf every time.
int main() {
Stack s(100); // Initialize two different stacks
Stack q(100);
Stack temp(100);
string line; // String to hold each individual line of the file
char letter;
char x; // For comparisons
char y;
// open the file
ifstream input;
input.open(READFILE);
// Check that it is open/readable
if (input.fail()) {
cout << endl << "Sorry, file not available, exiting program. Press enter";
cout << endl;
cin.get(); // Grab the enter
return 0;
}
while (getline(input, line)) { // Read the file line-by-line into "line"
cout << "The line: " << line << endl;
int length = line.length(); // Sets length equal to string length
for (int i =0; i<length; i++){ // Capitalizes string
line[i] = toupper(line[i]);
}
for (int i = 0; i < length; i++) { // Loop through for every letter in the line
if (line[i] == ' ' ) {
line.erase(i,1); // Takes spaces out of the line
length--;
}
if (ispunct(line[i])){
length--;
}
if (!ispunct(line[i])){ // Removes punctuation
letter = line[i]; // Push each letter onto the stack
s.push(letter);
}
}
for (int i = 0; i < length; i++) { // Popping half the letters off of the s stack
s.pop(letter); // and pushing them onto the q stack
q.push(letter);
temp.push(letter);
}
for (int i = 0; i < length; i++) {
temp.pop(letter);
s.push(letter);
}
bool tf = true; // Pop off the top of each stack and compare
while (!s.empty()) { // them to check for a palindrome
s.pop(x);
q.pop(y);
if (x == y);
else tf = false;
}
if (tf){
cout << "is a palindrome!" << endl;
}
if (!tf) {
cout << "is NOT a palindrome" << endl;
}
}
}
for (int i = 0; i < length/2; i++) // Popping half the letters off of the s stack
q.push(letter); // and pushing them onto the q stack
}
Here you're pushing the same letter over and over again.
Even if you rewrite it to do what the comment states, it will be wrong.
If you pop half of ABBA you have BA and AB, and compare B with A.
You need to rethink your strategy. Maybe push half of the string onto s, then loop backwards from length and push onto q.
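The last suggestion (push the first half onto s, then walk backwards from the end and push onto q) could look roughly like this. The sketch uses std::stack rather than the custom Stack class from the question, and isPalindromeTwoStacks is a hypothetical name; it assumes the text has already been stripped of spaces and punctuation and converted to a single case.

```cpp
#include <cstddef>
#include <stack>
#include <string>

// Returns true if `text` reads the same forwards and backwards.
bool isPalindromeTwoStacks(const std::string& text) {
    const std::size_t half = text.length() / 2;
    std::stack<char> s;  // first half, pushed in reading order
    std::stack<char> q;  // second half, pushed from the last character backwards

    for (std::size_t i = 0; i < half; ++i) {
        s.push(text[i]);
        q.push(text[text.length() - 1 - i]);
    }

    // Both stacks hold `half` letters; the middle letter of an
    // odd-length string never needs to be compared.
    while (!s.empty()) {
        if (s.top() != q.top())
            return false;
        s.pop();
        q.pop();
    }
    return true;
}
```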
Like someone else mentioned, even after fixing the for loop with the "q" stack, the general strategy is not correct. In fact you don't need two stacks. (or even one stack, but you can use a stack if desired.)
You do have the right idea in comparing the back half of the letters with the front half. In general, to find a palindrome you just need to see if the string is equal to the reversed string or that the first half is equal to the back half.
You can use a stack to store the string in reverse. All you need is the stack and the string. However, there is the extra problem here in that the lines of strings contain spaces and punctuation that you want to ignore. Using the erase() method reduces the length of the string as you go, so you need a temporary variable to rebuild the formatted string at the same time as the stack. EDIT: I saw your update that accounts for the reduced length; that's great. It can even save the use of a temp variable to hold the formatted string, so that the string variable line is all that is needed.
Here is another version of your while loop that uses one stack and a temp string variable. It uses half the formatted string to compare against the top of the stack (which represents the "back" of the string).
//cout << "test3";
while (getline(input, line)) { // Read the file line-by-line into "line"
    cout << "The line read was: " << line << endl;
    string cleanString; // Rebuilt for every line so nothing carries over
    int length = line.length(); // Sets length equal to string length
    for (int i = 0; i < length; i++) // Capitalizes string
        line[i] = toupper(line[i]);
    for (int i = 0; i < length; i++) // Loop through every letter in the line
        if ( !(line[i] == ' ' || ispunct(line[i]))) { // Ignore space & punctuation
            letter = line[i];
            s.push(letter); // Push each letter onto the stack
            cleanString.push_back(letter); // and onto the "cleaned" string to compare with later
            //cout << cleanString << endl; //test
        }
    length = cleanString.length();
    bool tf = true;
    for (int i = 0; i < length/2; i++) { // Pop off the top of stack
        s.pop(x); // to compare with front of string
        if ( cleanString[i] != x ) { // not a palindrome
            tf = false;
            break;
        }
    }
    while (!s.empty()) // Discard the leftover letters before the next line
        s.pop(x);
    if (tf){
        cout << "is a palindrome!" << endl;
    }
    if (!tf) {
        cout << "is NOT a palindrome" << endl;
    }
}
But it's simpler to skip the use of the stack altogether and instead just use the temp "cleaned" string, checking for a palindrome in a for loop with two counters: one for the front and one for the back.
So after the capitalization:
// Instead of a stack, just build a string of chars to check for a palindrome
for (int i = 0; i < length; i++)
    if ( !(line[i] == ' ' || ispunct(line[i]))) {
        letter = line[i];
        cleanString.push_back(letter); // Append each letter to a temp string
    }
length = cleanString.length(); // use length of formatted string
bool tf = true;
int front = 0; // first char of string
int back = length-1; // last char of string
for (; front < back; front++, back--) // stop when the two counters meet
    if ( cleanString[front] != cleanString[back] ) { // not a palindrome
        tf = false;
        break;
    }
Another option is to use the inbuilt reverse() function in the <algorithm> header file after building the temp string:
#include <algorithm> // reverse()
string cleanString;
string reversedCleanString;
//...
// Instead of a stack, just build a string of chars to check for a palindrome
for (int i = 0; i < length; i++)
if ( !(line[i] == ' ' || ispunct(line[i])))
cleanString.push_back(line[i]);
reversedCleanString = cleanString; // store copy of string to reverse
reverse(reversedCleanString.begin(), reversedCleanString.end() ); // reverse
bool tf = true;
if ( cleanString != reversedCleanString)
tf = false;
// ...
As moooeeeep mentioned in the comments, using std::string's reverse iterators simplifies this even further after the capitalization:
string cleanString;
//...
// Format line to test if palindrome
for (int i = 0; i < length; i++)
if ( !(line[i] == ' ' || ispunct(line[i])))
cleanString.push_back( line[i] );
bool tf = true;
if ( cleanString != string(cleanString.rbegin(), cleanString.rend()) )
tf = false;
// ...
Also, like moooeeeep mentioned, encapsulating the different parts of the while loop into their own separate functions is a good idea to make not just debugging easier but also understanding the logical flow of the problem more intuitively.
For example the while loop could look like this:
while (getline(input, line)) { // Read the file line-by-line into "line"
//echo input
cout << "The line read was: " << line << endl;
// validate/format the line
extractChars( line ); //remove space/punctuation
capitalizeString( line ); // capitalize chars for uniformity
//check if formatted line is a palindrome and output result
if ( is_palindrome( line ) )
cout << "Line IS a palindrome" << endl;
else
cout << "Line IS NOT a palindrome" << endl;
}
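The three helpers called in that loop are not defined above. One possible way to write them is sketched here; the bodies are assumptions based on the comments in the loop, built on the standard algorithms std::remove_if, std::transform and std::equal. Note that this extractChars removes all whitespace, not only the space character.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Removes whitespace and punctuation from `line` in place.
void extractChars(std::string& line) {
    line.erase(std::remove_if(line.begin(), line.end(),
                              [](unsigned char c) {
                                  return std::isspace(c) || std::ispunct(c);
                              }),
               line.end());
}

// Converts every character of `line` to upper case in place.
void capitalizeString(std::string& line) {
    std::transform(line.begin(), line.end(), line.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
}

// Compares the first half of `line` with the second half read backwards.
bool is_palindrome(const std::string& line) {
    return std::equal(line.begin(), line.begin() + line.size() / 2, line.rbegin());
}
```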
This is the question that needs to be implemented:
Write a C++ program that stops reading a line of text when a period is
entered and displays the sentence with correct spacing and capitalization. For this program, correct spacing means only one space between words, and all letters should be lowercase, except the first letter. For example, if the user enters the text "i am going to Go TO THe moVies.", the displayed sentence should be "I am going to go to the movies."
I have written my piece of code which looks like this:
// Processing a sentence and verifying if it is grammatically correct or not (spacing and capitalization)
//#include <stdio.h>
//#include <conio.h>
#include <iostream>
#include <string>
using namespace std;
int main()
{
string sentence;
cout << "Enter the sentence: ";
getline(cin, sentence);
int len = sentence.length();
// Dealing with capitalizations
for (int j = 0; j <= len; j++)
{
if (islower(sentence[0]))
sentence[0] = toupper(sentence[0]);
if(j>0)
if(isupper(sentence[j]))
sentence[j] = tolower(sentence[j]);
}
int space = 0;
do
{
for (int k = 0; k <= len; k++)
{
if(isspace(sentence[k]))
{
cout << k << endl;
int n = k+1;
if(sentence[n] == ' ' && n <=len)
{
space++;
cout << space <<endl;
n++;
cout << n <<endl;
}
if(space!= 0)
sentence.erase(k,space);
cout << sentence <<endl;
}
}
len = sentence.length();
//cout << len <<endl;
} while (space != 0);
}
With this I was able to deal with the capitalization issue, but a problem occurs when I try to check for more than one whitespace character between two words. In the do loop I somehow get stuck in an infinite loop.
For example, when I try to print the length of the string (len/len1) in the first line inside the do-while loop, it keeps running in an infinite loop. Similarly, when I try to print the value of k after the for loop, it again goes into an infinite loop. I think it has to do with my use of the do-while loop, but I am not able to get my head around it.
This is the output that I am receiving.
There are a few different issues with this code, but I believe that the code below addresses them. Hopefully this code is readable enough that you can learn a few techniques. For example, there is no need to capitalize the first letter inside the loop; do it once and be done with it.
The usual problem with infinite loops is that the loop termination condition is never met. Ensure that it will be met no matter what happens in the loop.
#include <iostream>
#include <string>
using namespace std;
int main() {
string sentence;
cout << "Enter the sentence: ";
getline(cin, sentence);
int len = sentence.find(".", 0) + 1; // up to and including the period
// Dealing with capitalizations
if (islower(sentence[0]))
sentence[0] = toupper(sentence[0]);
for (int j = 1; j < len; j++)
if(isupper(sentence[j]))
sentence[j] = tolower(sentence[j]);
// eliminate duplicate whitespace
for (int i = 0; i < len; i++)
if (isspace(sentence[i]))
// check the bound first so that index i + 1 stays inside the sentence
while (i + 1 < len && isspace(sentence[i + 1])) {
sentence.erase(i + 1, 1);
len--; // ensure sentence decreases in length
}
cout << sentence.substr(0, len) << endl;
}
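For comparison, the same clean-up can be written without index-based loops. This sketch swaps in std::unique with a predicate to collapse runs of whitespace, which is a different technique from the erase loop above; normalizeSentence is a hypothetical name. Leading whitespace is collapsed to a single character but not removed.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Returns the sentence up to its first period, lower-cased except for the
// first character, with every run of whitespace collapsed to one character.
std::string normalizeSentence(std::string sentence) {
    // Keep only the text up to and including the first period, if there is one.
    const std::string::size_type dot = sentence.find('.');
    if (dot != std::string::npos)
        sentence.erase(dot + 1);

    // Lower-case everything first.
    std::transform(sentence.begin(), sentence.end(), sentence.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    // std::unique drops a character when the predicate is true for it and its
    // predecessor, i.e. when both are whitespace.
    const auto newEnd = std::unique(sentence.begin(), sentence.end(),
                                    [](unsigned char a, unsigned char b) {
                                        return std::isspace(a) && std::isspace(b);
                                    });
    sentence.erase(newEnd, sentence.end());

    // Capitalize the first character once, outside any loop.
    if (!sentence.empty())
        sentence[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(sentence[0])));
    return sentence;
}
```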
Here goes
std::string sentence;
std::string new_sentence;
std::cout << "Enter the sentence: ";
std::getline(std::cin, sentence);
bool do_write = false; // Looking for first non-space character
bool first_char = true;
std::string::size_type i = 0; // Declared outside the loop because it is checked again afterwards
// Loop to end of string or .
for (; i < sentence.length() && sentence[i] != '.'; ++i) {
    if (sentence[i] != ' ') { // Not space - good - write it
        do_write = true;
    }
    if (do_write) {
        new_sentence += static_cast<char>(first_char ? toupper(sentence[i]) : tolower(sentence[i]));
        first_char = false;
    }
    if (sentence[i] == ' ') {
        do_write = false; // No more spaces please
    }
}
if (i < sentence.length()) { // Add dot if required
    new_sentence += '.';
}
I am working on a lab for my C++ class. I have a very basic working version of my lab running, however it is not quite how it is supposed to be.
The assignment:
Write a program that reads in a text file one word at a time. Store a word into a dynamically created array when it is first encountered. Create a parallel integer array to hold a count of the number of times that each particular word appears in the text file. If the word appears in the text file multiple times, do not add it into your dynamic array, but make sure to increment the corresponding word frequency counter in the parallel integer array. Remove any trailing punctuation from all words before doing any comparisons.
Create and use the following text file containing a quote from Bill Cosby to test your program.
I don't know the key to success, but the key to failure is trying to please everybody.
At the end of your program, generate a report that prints the contents of your two arrays in a format similar to the following:
Word Frequency Analysis
Word Frequency
I 1
don't 1
know 1
the 2
key 2
...
I can figure out if a word repeats more than once in the array, but I cannot figure out how to not add/remove that repeated word to/from the array. For instance, the word "to" appears three times, but it should only appear in the output one time (meaning it is in one spot in the array).
My code:
using namespace std;
int main()
{
ifstream file;
file.open("Quote.txt");
if (!file)
{
cout << "Error: Failed to open the file.";
}
else
{
string stringContents;
int stringSize = 0;
// find the number of words in the file
while (file >> stringContents)
{
stringSize++;
}
// close and open the file to start from the beginning of the file
file.close();
file.open("Quote.txt");
// create dynamic string arrays to hold the contents of the file
// these will be used to compare with each other the frequency
// of the words in the file
string *mainContents = new string[stringSize];
string *compareContents = new string[stringSize];
// holds the frequency of each word found in the file
int frequency[stringSize];
// initialize frequency array
for (int i = 0; i < stringSize; i++)
{
frequency[i] = 0;
}
stringContents = "";
cout << "Word\t\tFrequency\n";
for (int i = 0; i < stringSize; i++)
{
// if at the beginning of the iteration
// don't check for the reoccurence of the same string in the array
if (i == 0)
{
file >> stringContents;
// convert the current word to a c-string
// so we can remove any trailing punctuation
int wordLength = stringContents.length() + 1;
char *word = new char[wordLength];
strcpy(word, stringContents.c_str());
// set this to no value so that if the word has punctuation
// needed to remove, we can modify this string
stringContents = "";
// remove punctuation except for apostrophes
for (int j = 0; j < wordLength; j++)
{
if (ispunct(word[j]) && word[j] != '\'')
{
word[j] = '\0';
}
stringContents += word[j];
}
mainContents[i] = stringContents;
compareContents[i] = stringContents;
frequency[i] += 1;
}
else
{
file >> stringContents;
int wordLength = stringContents.length() + 1;
char *word = new char[wordLength];
strcpy(word, stringContents.c_str());
// set this to no value so that if the word has punctuation
// needed to remove, we can modify this string
stringContents = "";
for (int j = 0; j < wordLength; j++)
{
if (ispunct(word[j]) && word[j] != '\'')
{
word[j] = '\0';
}
stringContents += word[j];
}
// stringContents = "dont";
//mainContents[i] = stringContents;
compareContents[i] = stringContents;
// search for reoccurence of the word in the array
// if the array already contains the word
// don't add the word to our main array
// this is where I am having difficulty
for (int j = 0; j < stringSize; j++)
{
if (compareContents[i].compare(compareContents[j]) == 0)
{
frequency[i] += 1;
}
else
{
mainContents[i] = stringContents;
}
}
}
cout << mainContents[i] << "\t\t" << frequency[i];
cout << "\n";
}
}
file.close();
return 0;
}
I apologize if the code is difficult to understand/follow through. Any feedback is appreciated :]
If you use the STL, the entire problem can be solved easily, with less code.
#include <iostream>
#include <fstream>
#include <string>
#include <unordered_map>
#include <algorithm>
using namespace std;
int main()
{
ifstream file("Quote.txt");
string aword;
unordered_map<string,int> wordFreq;
if (!file.good()) {
cout << "Error: Failed to open the file.";
return 1;
}
else {
while( file >> aword ) {
aword.erase(remove_if(aword.begin (), aword.end (), ::ispunct), aword.end ()); //Remove Punctuations from string
unordered_map<string,int>::iterator got = wordFreq.find(aword);
if ( got == wordFreq.end() )
wordFreq.insert(std::make_pair<string,int>(aword.c_str(),1)); //insert the unique strings with default freq 1
else
got->second++; //found - increment freq
}
}
file.close();
cout << "\tWord Frequency Analyser\n"<<endl;
cout << " Frequency\t Unique Words"<<endl;
unordered_map<string,int>::iterator it;
for ( it = wordFreq.begin(); it != wordFreq.end(); ++it )
cout << "\t" << it->second << "\t\t" << it->first << endl;
return 0;
}
The algorithm that you use is very complex for such a simple task. Here is what you should do:
1. Make a first reading pass to determine the maximum size of the array.
2. Make a second reading pass and decide directly what to do with each word: if the string is already in the table, just increment its frequency; otherwise add it to the table.
3. Output the table.
The else block of your code would then look like:
string stringContents;
int stringSize = 0;
// find the number of words in the file
while (file >> stringContents)
stringSize++;
// close and open the file to start from the beginning of the file
file.close();
file.open("Quote.txt");
string *mainContents = new string[stringSize]; // dynamic array for strings found
int *frequency = new int[stringSize]; // dynamic array for frequency
int uniqueFound = 0; // no unique string found
for (int i = 0; i < stringSize && (file >> stringContents); i++)
{
//remove trailing punctuations
while (stringContents.size() && ispunct(stringContents.back()))
stringContents.pop_back();
// process string found
bool found = false;
for (int j = 0; j < uniqueFound; j++)
if (mainContents[j] == stringContents) { // if string already exist
frequency[j] ++; // increment frequency
found = true;
}
if (!found) { // if string not found, add it !
mainContents[uniqueFound] = stringContents;
frequency[uniqueFound++] = 1; // and increment number of found
}
}
// display results
cout << "Word\t\tFrequency\n";
for (int i=0; i<uniqueFound; i++)
cout << mainContents[i] << "\t\t" << frequency[i] <<endl;
// release the dynamic arrays
delete[] mainContents;
delete[] frequency;
}
OK, it's an assignment, so you have to use arrays. Later you could condense this code into:
string stringContents;
map<string, int> frequency;
while (file >> stringContents) {
while (stringContents.size() && ispunct(stringContents.back()))
stringContents.pop_back();
frequency[stringContents]++;
}
cout << "Word\t\tFrequency\n";
for (auto w:frequency)
cout << w.first << "\t\t" << w.second << endl;
and even have the words sorted alphabetically.
Depending on whether or not your assignment requires that you use an 'array', per se, you could consider using a std::vector or even a System::Collections::Generic::List for C++/CLI.
Using vectors, your code might look something like this:
#include <cstdlib> //For system()
#include <vector>
#include <string>
#include <fstream>
#include <iostream>
using namespace std;
int wordIndex(string); //Prototype a function to check if the vector contains the word
void processWord(string); //Prototype a function to handle each word found
vector<string> wordList; //The dynamic word list
vector<int> wordCount; //The dynamic word count
int main() {
    ifstream file("Quote.txt");
    if (!file) {
        cout << "Error: Failed to read file" << endl;
    } else {
        //Read each word into the 'word' variable
        string word;
        while (file >> word) { //Stops at end of file, so the last word is not processed twice
            //Algorithm to remove punctuation here
            processWord(word);
        }
    }
    //Write the output to the console
    for (int i = 0, j = wordList.size(); i < j; i++) {
        cout << wordList[i] << ": " << wordCount[i] << endl;
    }
    system("pause");
    return 0;
}
void processWord(string word) {
    int index = wordIndex(word); //Get the index of the word in the vector - if the word isn't in the vector yet, the function returns -1.
    //This serves a double purpose: check if the word exists in the vector, and if it does, what its index is.
    if (index > -1) {
        wordCount[index]++; //If the word exists, increment its word count in the parallel vector.
    } else {
        wordList.push_back(word); //If not, add a new entry
        wordCount.push_back(1); //in both vectors.
    }
}
int wordIndex(string word) {
    //Iterate through the word list vector
    for (int i = 0, j = wordList.size(); i < j; i++) {
        if (wordList[i] == word) {
            return i; //The word has been found. Return its index.
        }
    }
    return -1; //The word is not in the vector. Return -1 to tell the program that the word hasn't been added yet.
}
I've tried to annotate any new code/concepts with comments to make it easy to understand, so hopefully you can find it useful.
As a side note, you may notice that I've moved a lot of the repetitive code out of the main function and into other functions. This makes the code easier to read and maintain, because you can divide each problem into smaller, easily manageable problems.
Hope this can be of some use.