C++: counting how many words are in each line

I'm using this code to count the lines of a text file, but I also need to count the words and print to the console how many words are in each row.
int main(int argc, char *argv[]){
ifstream f1("text.txt");
char c;
string b;
int numchars[10] = {}, numlines = 0;
f1.get(c);
while (f1) {
while (f1 && c != '\n') {
// here I want to count how many words is in row
}
cout<<"in row: "<< numlines + 1 <<"words: "<< numchars[numlines] << endl;
numlines = numlines + 1;
f1.get(c);
}
f1.close();
system("PAUSE");
return EXIT_SUCCESS;
}

To count the number of lines and the number of words, you could break the problem down into two simple tasks: first, read each line of the text using getline(); second, extract each word from that line using a stringstream. After each successful action (reading a line or extracting a word), increment the corresponding counter.
The above could be implemented like so:
ifstream f1("text.txt");
// check if file is successfully opened
if (!f1) cerr << "Can't open input file.";
string line;
int line_count = 0;
string word;
int word_count = 0;
// read file line by line
while (getline(f1, line)) {
// count line
++line_count;
stringstream ss(line);
// extract all words from line
while (ss >> word) {
// count word
++word_count;
}
}
// print result
cout << "Total Lines: " << line_count <<" Total Words: "<< word_count << endl;
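The question also asks for the number of words in each row, which the snippet above does not report (it only prints totals). Here is a minimal sketch of that variation. The helper name words_per_line is made up for this example, and it takes a std::istream rather than a file name so it can be tried on an in-memory string:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original answer): returns the number of
// whitespace-separated words found on each line of the stream, in order.
std::vector<int> words_per_line(std::istream& in) {
    std::vector<int> counts;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream ss(line);
        std::string word;
        int n = 0;
        while (ss >> word) {
            ++n;  // one more word on the current line
        }
        counts.push_back(n);  // an empty line contributes 0
    }
    return counts;
}
```

Passing an ifstream opened on text.txt should work the same way; printing counts[i] next to i + 1 gives the per-row output the question asks for.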

Related

How to read a file line by line and separate the lines components?

I am new to C++ (I usually use Java) and am trying to make a k-ary heap. I want to insert values from a file into the heap; however, I am at a loss with the code for the things I want to do.
I wanted to use .nextLine and .hasNextLine like I would in Java with a scanner, but I am not sure those are applicable to C++. Also, in the file the items are listed as such: "IN 890", "IN 9228", "EX", "IN 847", etc. The "IN" portion tells me to insert and the "EX" portion is for my extract_min. I don't know how to separate the string and integer in C++ so I can insert just the number though.
int main(){
BinaryMinHeap h;
string str ("IN");
string str ("EX");
int sum = 0;
int x;
ifstream inFile;
inFile.open("test.txt");
if (!inFile) {
cout << "Unable to open file";
exit(1); // terminate with error
}
while (inFile >> x) {
sum = sum + x;
if(str.find(nextLin) == true //if "IN" is in line)
{
h.insertKey(nextLin); //insert the number
}
else //if "EX" is in line perform extract min
}
inFile.close();
cout << "Sum = " << sum << endl;
}
The result should just add the number into the heap or extract the min.
Look at the various std::istream implementations - std::ifstream, std::istringstream, etc. You can call std::getline() in a loop to read a std::ifstream line by line, using std::istringstream to parse each line. For example:
int main() {
BinaryMinHeap h;
string line, item;
int x, sum = 0;
ifstream inFile;
inFile.open("test.txt");
if (!inFile) {
cout << "Unable to open file";
return 1; // terminate with error
}
while (getline(inFile, line)) {
istringstream iss(line);
iss >> item;
if (item == "IN") {
iss >> x;
sum += x;
h.insertKey(x);
}
else if (item == "EX") {
// perform extract min
}
}
inFile.close();
cout << "Sum = " << sum << endl;
return 0;
}
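To try the parsing logic without the heap class from the question, here is a self-contained sketch. It assumes std::priority_queue as a stand-in for BinaryMinHeap (whose interface is not shown in full), and the helper name run_commands is hypothetical. It also checks that the number after "IN" was actually read before using it:

```cpp
#include <functional>
#include <istream>
#include <queue>
#include <sstream>
#include <string>
#include <vector>

// Min-heap of ints; an assumed stand-in for the question's BinaryMinHeap.
using MinHeap = std::priority_queue<int, std::vector<int>, std::greater<int>>;

// Hypothetical helper: applies "IN <n>" / "EX" commands read line by line
// and returns the values removed by each successful "EX", in order.
std::vector<int> run_commands(std::istream& in, MinHeap& heap) {
    std::vector<int> extracted;
    std::string line, command;
    while (std::getline(in, line)) {
        std::istringstream iss(line);
        if (!(iss >> command)) continue;  // skip blank lines
        int value;
        if (command == "IN" && (iss >> value)) {
            heap.push(value);             // insert only if a number was parsed
        } else if (command == "EX" && !heap.empty()) {
            extracted.push_back(heap.top());
            heap.pop();                   // extract the current minimum
        }
    }
    return extracted;
}
```

Lines that are neither "IN <number>" nor "EX" are silently ignored here; whether that is the right behavior depends on the input you expect.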

Word Counting Program C++

I'm currently trying to program a word counting program in C++ and am running into difficulties getting it to parse through a string and separate words from one another. In addition to this I am having a hard time getting the word count for unique words to increment each time the word repeats. My findWord() and DistinctWords() functions are most likely the issues from what I can tell. Perhaps you will see something I do not though in the others, as for the aforementioned functions I have no clue as to what's messing up in them. These are the directions provided by my instructor:
Create a program which will count and report on the number of occurrences of distinct, case insensitive words in a text file.
The program should have a loop that:
1. Prompts the user to enter a file name. Terminates the loop and the program if the user presses the Enter key only.
2. Verifies that a file with the name entered exists. If the file does not exist, display an appropriate message and return to step 1.
3. Reads and displays the contents of the file.
4. Displays a count of the distinct words in the file.
5. Displays a sorted list of each of the distinct words in the file and the number of occurrences of each word. Sort the list in descending order by word count, ascending order by word.
I am pretty stuck right now and my assignment is due at midnight. Help would certainly be greatly appreciated. Thank you for your time. Here is the code I have, I will also copy paste an example test text file after it:
#include <iostream>
#include <iomanip>
#include <string>
#include <fstream> // Needed to use files
#include <vector>
#include <algorithm> // Needed for sort from standard libraries
using namespace std;
struct WordCount{
string word; // Word
int count; // Occurence #
void iCount(){ count++; }
WordCount(string s){ word = s; count = 1;}
};
// Function prototypes
string InputText(); // Get user file name and get text from said file
string Normalize(string); // Convert string to lowercase and remove punctuation
vector<WordCount> DistinctWords(string); // Sorted vector of word count structures
bool findWord(string, vector<WordCount>); // Linear search for word in vector of structures
void DisplayResults(vector<WordCount>); // Display results
// Main
int main(int argc, char** argv) {
// Program Title
cout << "Lab 9 - Text File Word Counter\n";
cout << "-------------------------------\n\n";
// Input text from file
string buffer = InputText();
while (buffer != ""){
// Title for text file reading
cout << "\nThis is the text string read from the file\n";
cout << "-------------------------------------------\n";
cout << buffer << endl << endl;
// Build vector of words and counts
vector<WordCount> words = DistinctWords(buffer);
// Display results
cout << "There are " << words.size() << " unique words in the above text." << endl;
cout << "--------------------------------------------" << endl << endl;
DisplayResults(words);
buffer = InputText();
}
return 0;
}
/***********************************************
InputText() -
Gets user file name and gets text from the file.
************************************************/
string InputText(){
string fileName;
ifstream inputFile; // Input file stream object
string str; // Temporary string
string text; // Text file string
cout << "File name? ";
getline(cin, fileName);
// Case to terminate the program for enter key
if (fileName.empty()){ exit(0);}
// Open file
inputFile.open(fileName);
if (!inputFile){
cout << "Error opening data file\n";
cout << "File name? "; cin >> fileName;
}
else{
while (!inputFile.eof()){
getline(inputFile, str);
text += str;
}
}
inputFile.close(); return text;
}
/****************************************************
Normalize(string) -
Converts string to lowercase and removes punctuation.
*****************************************************/
string Normalize(string s){
// Initialize variables
string nString;
char c;
// Make all text lowercase
for (int i = 0; i < s.length(); i++){
c = s[i];
c = tolower(c);
nString += c;
}
// Remove punctuation
for (int i = 0; i < nString.length(); i++){
if (ispunct(nString[i]))
nString.erase(i, 1);
}
// Return converted string
return nString;
}
/******************************************
vector<WordCount> DistinctWords(string) -
Sorts vector of word count structures.
*******************************************/
vector<WordCount> DistinctWords(string s){
vector<WordCount> words; // Initialize vector for words
string nString = Normalize(s); // Convert passed string to lowercase and remove punctuation
// Parse string
istringstream iss(nString);
while(iss >> nString){
string n; // Intialize temporary string
iss >> n; // Put word in n
if (findWord(n, words) == true){ continue; } // Check to verify that there is no preexisting occurence of the word passed
else{
WordCount tempO(n); // Make structure object with n
words.push_back(tempO); // Push structure object into words vector
}
}
return words;
}
/*********************************************
bool findWord(string, vector<WordCount>) -
Linear search for word in vector of structures
**********************************************/
bool findWord(string s, vector<WordCount> words){
// Search through vector
for (auto r : words){
if (r.word == s){ // Increment count of object if found again
r.iCount(); return true;
}
else // Go back to main function if not found
return false;
}
}
/***********************************************
void DisplayResults(vector<WordCount>) -
Displays results.
************************************************/
void DisplayResults(vector<WordCount> words){
// TROUBLESHOOT FIRST ERASE THIS AFTER!!!!!
cout << "Word" << setw(20) << "Count\n";
cout << "-----------------------\n";
for (auto &r : words){
cout << setw(6) << left << r.word;
cout << setw(15) << right << r.count << endl;
}
}
It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness, it was the spring of hope, it was the winter of despair, we had everything before us, we had nothing before us, we were all going direct to heaven, we were all going direct the other way - in short, the period was so far like the present period, that some of its noisiest authorities insisted on its being received, for good or for evil, in the superlative degree of comparison only.
This is the example display he provided for this particular test file
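You almost had it!
The main problem is that findWord() takes the 'words' vector by copy instead of by reference, so the counts it increments are lost when the function returns. The range-based for loop inside it has to use auto& for the same reason.
I also included a custom comparator for the sort at the end.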
#include <iostream>
#include <sstream>
#include <iomanip>
#include <string>
#include <fstream> // Needed to use files
#include <vector>
#include <algorithm> // Needed for sort from standard libraries
using namespace std;
struct WordCount{
string word; // Word
int count; // Occurence #
void iCount(){ count++; }
WordCount(string s){ word = s; count = 1;}
};
struct {
bool operator()(const WordCount& a, const WordCount& b)
{
if (a.count < b.count)
return false;
else if (a.count > b.count)
return true;
else{
if (a.word < b.word)
return true;
else
return false;
}
}
} CompareWordCount;
// Function prototypes
string InputText(); // Get user file name and get text from said file
string Normalize(string); // Convert string to lowercase and remove punctuation
vector<WordCount> DistinctWords(string); // Sorted vector of word count structures
bool findWord(string, vector<WordCount>&); // Linear search for word in vector of structures
void DisplayResults(vector<WordCount>); // Display results
// Main
int main(int argc, char** argv) {
// Program Title
cout << "Lab 9 - Text File Word Counter\n";
cout << "-------------------------------\n\n";
// Input text from file
string buffer = InputText();
while (buffer != ""){
// Title for text file reading
cout << "\nThis is the text string read from the file\n";
cout << "-------------------------------------------\n";
cout << buffer << endl << endl;
// Build vector of words and counts
vector<WordCount> words = DistinctWords(buffer);
// Display results
cout << "There are " << words.size() << " unique words in the above text." << endl;
cout << "--------------------------------------------" << endl << endl;
DisplayResults(words);
buffer = InputText();
}
return 0;
}
/***********************************************
InputText() -
Gets user file name and gets text from the file.
************************************************/
string InputText(){
string fileName;
ifstream inputFile; // Input file stream object
string str; // Temporary string
string text; // Text file string
cout << "File name? ";
getline(cin, fileName);
// Case to terminate the program for enter key
if (fileName.empty()){ exit(0);}
// Open file
inputFile.open(fileName);
if (!inputFile){
cout << "Error opening data file\n";
cout << "File name? "; cin >> fileName;
}
else{
while (getline(inputFile, str)){
text += str;
text += ' '; // Keep words on adjacent lines separated
}
}
inputFile.close(); return text;
}
/****************************************************
Normalize(string) -
Converts string to lowercase and removes punctuation.
*****************************************************/
string Normalize(string s){
// Initialize variables
string nString;
char c;
// Make all text lowercase
for (int i = 0; i < s.length(); i++){
c = s[i];
c = tolower(c);
if (isalpha(c) || isblank(c))
nString += c;
}
// Return converted string
return nString;
}
/******************************************
vector<WordCount> DistinctWords(string) -
Sorts vector of word count structures.
*******************************************/
vector<WordCount> DistinctWords(string s){
vector<WordCount> words; // Initialize vector for words
string nString = Normalize(s); // Convert passed string to lowercase and remove punctuation
// Parse string
istringstream iss(nString);
string n; // Intialize temporary string
while(iss >> n){
if (findWord(n, words) == true){ continue; } // Check to verify that there is no preexisting occurence of the word passed
else{
WordCount tempO(n); // Make structure object with n
words.push_back(tempO); // Push structure object into words vector
}
}
return words;
}
/*********************************************
bool findWord(string, vector<WordCount>) -
Linear search for word in vector of structures
**********************************************/
bool findWord(string s, vector<WordCount>& words){
// Search through vector
for (auto& r : words){
if (r.word.compare(s) == 0){ // Increment count of object if found again
r.iCount(); return true;
}
}
return false; // Word was not found in the vector
}
/***********************************************
void DisplayResults(vector<WordCount>) -
Displays results.
************************************************/
void DisplayResults(vector<WordCount> words){
// TROUBLESHOOT FIRST ERASE THIS AFTER!!!!!
cout << "Word" << setw(20) << "Count\n";
cout << "-----------------------\n";
sort(words.begin(), words.end(),CompareWordCount);
for (auto &r : words){
cout << setw(6) << left << r.word;
cout << setw(15) << right << r.count << endl;
}
}
Consider using std::map for the word-counting task:
#include <iostream>
#include <map>
#include <string>
#include <vector>
using namespace std;
int main()
{
map<string, int> wordCount;
vector<string> inputWords = {"some", "test", "stuff", "test",
"stuff"}; // read from file instead
for (auto& s : inputWords)
wordCount[s]++; // operator[] creates a missing entry with count 0
for (auto& entry : wordCount) // print all words and associated counts
cout << entry.first << " " << entry.second << endl;
cout << wordCount.size() << endl; // that's the number of distinct words
}
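A std::map iterates in key order, not in the order the assignment wants (count descending, then word ascending). One way to get there, sketched under the assumption that a word consists of letters only, is to copy the entries into a vector and sort that. The helper name count_words is made up for this example:

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: counts case-insensitive words (letters only) and
// returns (word, count) pairs sorted by count descending, then word ascending.
std::vector<std::pair<std::string, int>> count_words(std::istream& in) {
    std::map<std::string, int> counts;
    std::string token;
    while (in >> token) {
        std::string word;
        for (unsigned char c : token) {
            // Drop punctuation and digits, lowercase the rest.
            if (std::isalpha(c)) word += static_cast<char>(std::tolower(c));
        }
        if (!word.empty()) ++counts[word];
    }
    std::vector<std::pair<std::string, int>> result(counts.begin(), counts.end());
    std::sort(result.begin(), result.end(),
              [](const std::pair<std::string, int>& a,
                 const std::pair<std::string, int>& b) {
                  if (a.second != b.second) return a.second > b.second;
                  return a.first < b.first;
              });
    return result;
}
```

The size of the returned vector is the number of distinct words, so no separate linear search is needed.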

C++ Tokenize part of a String

I'm trying to tokenize a string in C++ that is read from a file. The fields are separated by commas, but I only need the first 3 fields of every line.
For example:
The lines look like this:
140,152,2240,1,0,3:0:0:0:
156,72,2691,1,0,1:0:0:0:
356,72,3593,1,0,1:0:0:0:
But I only need the first 3 fields of these lines. In this case:
140, 152, 2240
156, 72, 2691
356, 72, 3593
I'm trying to add these values to a vector, but I don't know how to skip the rest of a line after the first 3 fields.
This is my current code: (canPrint is true by default)
ifstream ifs;
ifs.open("E:\\sample.txt");
if (!ifs)
cout << "Error reading file\n";
else
cout << "File loaded\n";
int numlines = 0;
int counter = 0;
string tmp;
while (getline(ifs, tmp))
{
//getline(ifs, tmp); // Saves the line in tmp.
if (canPrint)
{
//getline(ifs, tmp);
numlines++;
// cout << tmp << endl; // Prints our tmp.
vector<string> strings;
vector<customdata> datalist;
istringstream f(tmp);
string s;
while (getline(f, s, ',')) {
cout << s << " ";
strings.push_back(s);
}
cout << "\n";
}
How about checking the size of the vector first? Perhaps something like
while (strings.size() < 3 && getline(f, s, ',')) { ... }
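As a self-contained sketch of that idea (the helper name first_fields and its max_fields parameter are made up for illustration):

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns at most max_fields comma-separated fields
// from the start of line; anything after them is never read.
std::vector<std::string> first_fields(const std::string& line, std::size_t max_fields) {
    std::vector<std::string> fields;
    std::istringstream f(line);
    std::string s;
    // The size check comes first, so getline is not called once we have enough.
    while (fields.size() < max_fields && std::getline(f, s, ',')) {
        fields.push_back(s);
    }
    return fields;
}
```

Because the outer loop in the question creates a fresh istringstream for every line, the unread remainder of each line is simply discarded along with the stream.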

Counting lines from a file input?

The following code is supposed to count: the lines, the characters and the words read from a text file.
Input text file:
This is a line.
This is another one.
The desired output is:
Words: 8
Chars: 36
Lines: 2
However, the word count comes out to 0 and if I change it then lines and characters come out to 0 and the word count is correct. I am getting this:
Words: 0
Chars: 36
Lines: 2
This is my code:
#include<iostream>
#include<fstream>
#include<string>
using namespace std;
int main()
{
ifstream inFile;
string fileName;
cout << "Please enter the file name " << endl;
getline(cin,fileName);
inFile.open(fileName.c_str());
string line;
string chars;
int number_of_lines = 0;
int number_of_chars = 0;
while( getline(inFile, line) )
{
number_of_lines++;
number_of_chars += line.length();
}
string words;
int number_of_words = 0;
while (inFile >> words)
{
number_of_words++;
}
cout << "Words: " << number_of_words <<"" << endl;
cout << "Chars: " << number_of_chars <<"" << endl;
cout << "Lines: " << number_of_lines <<"" << endl;
return 0;
}
Any guidance would be greatly appreciated.
And because Comments are often unread by answer seekers...
while( getline(inFile, line) )
Reads through the entire file. When it's done inFile's read location is set to the end of the file so the word counting loop
while (inFile >> words)
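starts reading at the end of the file and finds nothing. The smallest change that makes the code work is to clear the stream's error flags and then use seekg to rewind the file before counting the words. The clear() call is needed because the failed getline leaves failbit set, and seekg does nothing on a stream that is in a failed state.
inFile.clear();
inFile.seekg (0, inFile.beg);
while (inFile >> words)
This resets the stream state, positions the read location at file offset 0 relative to the beginning of the file (specified by inFile.beg), and then reads through the file to count the words.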
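The rewind can be tried without a file on disk. This is only a sketch: the Counts struct and the count_two_pass name are invented for the example, and it works on any seekable std::istream (an ifstream or an istringstream):

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical result type for this example.
struct Counts { int lines; int chars; int words; };

// Hypothetical helper: two passes over the same seekable stream.
Counts count_two_pass(std::istream& in) {
    Counts c{0, 0, 0};
    std::string line;
    while (std::getline(in, line)) {      // first pass: lines and characters
        ++c.lines;
        c.chars += static_cast<int>(line.length());
    }
    in.clear();                           // reset eofbit/failbit left by the last getline
    in.seekg(0, std::ios::beg);           // rewind to the start
    std::string word;
    while (in >> word) ++c.words;         // second pass: words
    return c;
}
```

Note that line.length() does not include the newline characters, so the character total counts only what is on the lines themselves.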
While this works, it requires two complete reads through the file, which can be quite slow. A better option suggested by crashmstr in the comments and implemented by simplicis veritatis as another answer requires one read of the file to get and count lines, and then an iteration through each line in RAM to count the number of words.
This does the same total amount of counting, since everything must still be counted one by one, but iterating over a line that is already in memory is preferable to reading it from disk a second time, because memory access is typically orders of magnitude faster.
Here is one possible implementation (not tested) of that single-pass approach:
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;
int main(){
// print prompt message and read input
cout << "Please enter the file name " << endl;
string fileName;
getline(cin,fileName);
// create an input stream and attach it to the file to read
ifstream inFile;
inFile.open(fileName.c_str());
// define counters
string line;
int number_of_lines = 0;
int number_of_chars = 0;
vector<string> all_words;
// loop on getline itself so a failed read at end of file is not counted as a line
while(getline(inFile, line)){
// count lines
number_of_lines++;
// count chars (line length without the newline, as in the question's code)
number_of_chars += line.length();
// count words
// separates the line into individual words, uses white space as separator
stringstream ss(line);
string word;
while(ss >> word){
all_words.push_back(word);
}
}
// print result
cout << "Words: " << all_words.size() << endl;
cout << "Chars: " << number_of_chars << endl;
cout << "Lines: " << number_of_lines << endl;
return 0;
}

How to comma separate a string read from a file and then saving it in an array

This is the text file that i have created
NameOfProduct,Price,Availability.
Oil,20$,yes
Paint,25$,yes
CarWax,35$,no
BrakeFluid,50$,yes
I want to read this data from the file line by line and then split it on the comma(,) sign and save it in an array of string.
string findProduct(string nameOfProduct)
{
string STRING;
ifstream infile;
string jobcharge[10];
infile.open ("partsaval.txt"); //open the file
int x = 0;
while(!infile.eof()) // To get you all the lines.
{
getline(infile,STRING); // Saves the line in STRING.
stringstream ss(STRING);
std::string token;
while(std::getline(ss, token, ','))
{
//std::cout << token << '\n';
}
}
infile.close(); // closing the file for safe handeling if another process wantst to use this file it is avaliable
for(int a= 0 ; a < 10 ; a+=3 )
{
cout << jobcharge[a] << endl;
}
}
The problem:
When I remove the comment on the line that prints token, all of the data is printed perfectly. However, when I try to print the contents of the array (jobcharge[]), it doesn't print anything.
You cannot store a whole line in a single array element: each element holds one string, and you want to store three per line. You also forgot to assign the tokens to the array.
You need a 2D array:
string jobcharge[10][3];
int x = 0;
while(getline(infile,STRING)) // Reads each line into STRING; stops at end of file.
{
stringstream ss(STRING);
std::string token;
int y = 0;
while(std::getline(ss, token, ','))
{
std::cout << token << '\n';
jobcharge[x][y] = token;
y++;
}
x++;
}
Then you can print the array like this:
for(int a= 0 ; a < 10 ; a++ )
{
for(int b= 0 ; b < 3 ; b++ )
{
cout << jobcharge[a][b] << endl;
}
}
Bear in mind that this code will completely fail if you have more than 10 lines or more than 3 items per line. You should check the values of x and y inside the loops.
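If the number of lines or fields is not known in advance, a vector of vectors avoids the fixed limits altogether. This is a sketch of that alternative, not part of the answer above; the helper name read_rows is made up, and it takes a std::istream so it works with an ifstream as well as an in-memory string:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits every line of the stream on ',' and
// returns one row of fields per line.
std::vector<std::vector<std::string>> read_rows(std::istream& in) {
    std::vector<std::vector<std::string>> rows;
    std::string line;
    while (std::getline(in, line)) {
        std::vector<std::string> row;
        std::istringstream ss(line);
        std::string token;
        while (std::getline(ss, token, ',')) {
            row.push_back(token);  // grows as needed, no index to overflow
        }
        rows.push_back(row);
    }
    return rows;
}
```

With this layout, rows[a][b] plays the role of jobcharge[a][b], and rows.size() and rows[a].size() give the loop bounds for printing.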
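You can use fscanf() instead. Here fp is assumed to be a FILE* that was opened with fopen(); the last field stops at a comma or a newline so that it does not run into the next line:
char name[100];
char price[16];
char yesno[16];
while (fscanf(fp, " %99[^,],%15[^,],%15[^,\n]", name, price, yesno)==3) {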
....
}