Reading data from text files in C++ and changing the values

I am stuck at this normalization part. I am using a string and reading the data one character at a time, but the output keeps coming out blank. The program compiles. Any hints on what to do next would be awesome. How would I complete the 5 steps below? Steps 3 and 4 work fine.
A program that reads a text file using character-by-character I/O and performs the following normalization tasks:
Replaces all tab characters with 8 spaces
Replaces all upper-case letters with lower-case letters
All # symbols will be replaced by the word "at".
All = signs will be replaced by a series of 19 = signs.
When you find an asterisk, you will print a series of asterisks. The character following the asterisk indicates the number of asterisks to print. Use the ASCII value of the character following. Number of asterisks is ASCII value minus 32 plus 1. The character following the asterisk is used only as a counter, not a data character.
#include <iostream>
#include <fstream>
#include <string>
#include <cstring>
using namespace std;
int main() {
ifstream fin;
ofstream fout;
char fname[256];
char ofname[256];
char norm[256] = ".normal";
int count = 0;
//Opening input and output file
cout << "What file do you want to be normalized? \n";
cin >> fname;
cout << "\n";
fin.open(fname);
if (fin.fail()) {
cout << "Error opening input file! \n";
return 0;
}
strcpy( ofname, fname);
strcat( ofname, norm);
fout.open(ofname);
if (fout.fail()) {
cout << "Error opening output file! \n";
return 0;
}
cout << "Your output file name is: " << ofname << "\n";
//Normalization begins here
char data;
while (fin.get(data)) {
if (data == "\t") { //***
fout << "        ";
}// else if (isupper(data)) { //***
// fout << tolower(data); //***
else if (data == "#") {
fout << "at";
} else if (data == "=") {
fout << "===================";
} else if (data == "*") {
fout << "some shit";
}
}
fin.close();
fout.close();
return 0;
}

You were on the right track. Rather than a long-winded, line-by-line explanation, I've included comments in the code below. Your primary challenge is that you were treating the data as strings, whereas the problem seems to require char data (note the single-quoted character literals such as '\t' in the case labels). Also, when reading character by character with >>, you need the stream manipulator noskipws to ensure that whitespace characters are not skipped. There are many, many ways to do this. This is just one example to compare against the approach you are taking:
#include <iostream>
#include <fstream>
#include <string>
#include <cstring>
#include <cctype> // isupper, tolower
using namespace std;
int main () {
ifstream fin;
ofstream fout;
char fname[256];
char ofname[256];
char norm[256] = ".normal";
char eq[] = "==================="; // a convenient 19-char string of '='
//Opening input and output file
cout << endl << " Enter name of file to be normalized: ";
cin >> fname;
cout << endl;
fin.open (fname);
if (fin.fail ()) {
cout << "Error opening input file! \n";
return 0;
}
strcpy (ofname, fname);
strcat (ofname, norm);
fout.open (ofname);
if (fout.fail ()) {
cout << "Error opening output file! \n";
return 0;
}
cout << endl << " Your output file name is: " << ofname << endl << endl;
//Normalization begins here
char data; // declare data as 'char' not 'string'
fin >> noskipws >> data; // read each char (including whitespace)
while (!fin.eof ()) {
switch (data)
{
case '\t' : // replace 'tab' by '8 chars'
fout << "        ";
break;
case '#' : // replace '#' by 'at'
fout << "at";
break;
case '=' : // replace '=' by series of 19 '='
fout << eq;
break;
case '*' : // replace '*n' by series of (ascii n - 31) '*'
// fin >> count;
fin >> data; // read next value
if (fin.eof ()) // test if eof set
break;
for (int it=0; it < data - 31; it++) // output calculate number of asterisks
fout << '*';
break;
default: // use default case to process all other data and
if (isupper (data)) { // test upper/lower-case.
char lc = tolower (data);
fout << lc;
} else {
fout << data;
}
}
fin >> noskipws >> data; // read the next character
}
fin.close (); // close files & return
fout.close ();
return 0;
}
Test Input:
$ cat dat/test.dat
A program that reads a text:
Tab ' ' -> 8 spaces.
U-C letters -> l-c letters.
All # -> "at".
All = signs -> a series of 19 = signs.
All 'asterisks''n' like '*6'. -> n series of asterisks
(Where Number of Asterisks is ASCII value minus 32 plus 1).
Output:
$ cat dat/test.dat.normal
a program that reads a text:
tab '        ' -> 8 spaces.
u-c letters -> l-c letters.
all at -> "at".
all =================== signs -> a series of 19 =================== signs.
all 'asterisks''n' like '***********************'. -> n series of asterisks
(where number of asterisks is ascii value minus 32 plus 1).

Related

inData.open and inData.close in C++

I found the proper way to add a while loop so I can complete this exercise. However, two things still need fixing. First, the file contents are displayed twice: once properly, and a second time on a single line (I don't want this second copy to show up). Second, the counter increment (wordcounter++) should report 7 words but reports 8 instead. Why? Could you help me with this? The issue is in the last while loop.
#include<iostream>
#include<fstream>//step#1
#include<string>
using namespace std;
int main()
{
string word, fileName;
int charcounter = 0, wordcounter = 0;
char character;
ifstream inData;// incoming file stream variable
cout << " Enter filename or type quit to exit: ";
cin >> fileName;
//loop to allow for multiple files data reads
while (fileName != "quit")
{
inData.open(fileName.c_str());//open file and bind file to ifstream variable
//loop for file not found validation
while (!inData)//filestream is in fail state due to no file
{
inData.clear();//clear the fail state
cout << "File not found. Enter the correct filename: ";
cin >> fileName;
inData.open(fileName.c_str());
}
inData >> character;//extract a single character from the file
cout << "\n*****************************\n";
while (inData)
{
cout << character;
inData.get(character);//extract the next character and the next character
charcounter++;
}
//Here is the loop that is missing something
//I was told to close the file
inData.close();
//open up the file again and add the while loop
inData.open(fileName.c_str());
while (inData)
{
cout << word;
inData >> word;//extract the next word and the next word
wordcounter++;
}
cout << "\n******************************\n";
cout << fileName << " has " << wordcounter << " words" << endl;
inData.close();//close the ifstream conection to the data file
charcounter = 0; //reset char and word counts
wordcounter = 0;
//port for next file or exit
cout << "Enter a filename or type quit to exit: ";
cin >> fileName;
}
return 0;
}
The reason you are getting redundant output is that you are outputting the contents of the file twice, first in the character loop:
while (inData)
{
cout << character;
...
and then again in the word loop:
while (inData)
{
cout << word;
...
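Whether you output character-by-character or word-by-word, you are outputting the contents of the file. Doing it once character-by-character in one loop and then again word-by-word in another results in twice the output. The second copy appears on a single line because >> discards the whitespace between words. The extra word in the count comes from the same loop: while (inData) tests the stream before the read, so the final, failed inData >> word is still counted.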
Further, there is no need to loop over the file twice to count the characters and then words -- do it all in a single loop, e.g.
int charcounter = 0,
wordcounter = 0,
inword = 0; /* flag indicating reading chars in word */
...
while (inData.get(character)) { /* read each character in file */
if (isspace (character)) /* if character is whitespace */
inword = 0; /* set inword flag zero */
else { /* if non-whitespace */
if (!inword) /* if not already in word */
wordcounter++; /* increment wordcounter */
inword = 1; /* set inword flag 1 */
}
charcounter++; /* increment charcounter */
}
The remaining problems are due to the jumbled loop logic you use to open different files until the user types "quit". You only need one outer loop that runs until the user types "quit". You don't need multiple checks and multiple prompts for the filename. Simply use a single loop, e.g.
for (;;) { /* loop continually until "quit" entered as fileName */
string word, fileName; /* fileName, character, inData are */
char character; /* only needed within loop */
int charcounter = 0,
wordcounter = 0,
inword = 0; /* flag indicating reading chars in word */
ifstream inData;
cout << "\nEnter filename or type quit to exit: ";
if (!(cin >> fileName)) { /* validate every read */
cerr << "(error: fileName)\n";
return 1;
}
if (fileName == "quit") /* test for quit */
return 0;
inData.open (fileName); /* no need for .c_str() */
if (!inData.is_open()) { /* validate file is open */
cerr << "error: unable to open " << fileName << '\n';
continue;
}
... /* the read loop goes here */
inData.close(); /* closing is fine, but will close at loop end */
cout << '\n' << fileName << " has " << wordcounter << " words and "
<< charcounter << " characters\n";
}
Making those changes cleans up your program flow and makes the loop logic straightforward. Putting it all together, you could do:
#include <iostream>
#include <fstream>
#include <string>
#include <cctype>
using namespace std;
int main (void) {
for (;;) { /* loop continually until "quit" entered as fileName */
string word, fileName; /* fileName, character, inData are */
char character; /* only needed within loop */
int charcounter = 0,
wordcounter = 0,
inword = 0; /* flag indicating reading chars in word */
ifstream inData;
cout << "\nEnter filename or type quit to exit: ";
if (!(cin >> fileName)) { /* validate every read */
cerr << "(error: fileName)\n";
return 1;
}
if (fileName == "quit") /* test for quit */
return 0;
inData.open (fileName); /* no need for .c_str() */
if (!inData.is_open()) { /* validate file is open */
cerr << "error: unable to open " << fileName << '\n';
continue;
}
while (inData.get(character)) { /* read each character in file */
if (isspace (character)) /* if character is whitespace */
inword = 0; /* set inword flag zero */
else { /* if non-whitespace */
if (!inword) /* if not already in word */
wordcounter++; /* increment wordcounter */
inword = 1; /* set inword flag 1 */
}
charcounter++; /* increment charcounter */
}
inData.close(); /* closing is fine, but will close at loop end */
cout << '\n' << fileName << " has " << wordcounter << " words and "
<< charcounter << " characters\n";
}
}
Example Input File
$ cat ../dat/captnjack.txt
This is a tale
Of Captain Jack Sparrow
A Pirate So Brave
On the Seven Seas.
Example Use/Output
$ ./bin/char_word_count
Enter filename or type quit to exit: ../dat/captnjack.txt
../dat/captnjack.txt has 16 words and 76 characters
Enter filename or type quit to exit: quit
Confirmation with wc
$ wc ../dat/captnjack.txt
4 16 76 ../dat/captnjack.txt
Look things over and let me know if you have additional questions.

C++ Read String from Text file and save word by word into linkedlist

I have been assigned to write a C++ program for a room booking system. I know how to read a text file line by line and save it, but my problem is how to read a text file word by word.
This is the text file I have:
1-Reserved-2018-12-23-Lecture Room-13
2-Reserved-2018-11-34-Tutorial Room-15
3-Not Reserved-0-0-0-Design Studio-18
4-Reserved-2018-11-16-Lecture Room-14
5-Not Reserved-0-0-0-Exam Hall-18
I want to read the text file and save the words into the list's nodes (ID, data, type of room, etc.). Is there any way to do so in C++?
This is my class:
class room {
public:
int length;
int initial;
enum class roomType { main_hall, exam_hall, lecture_room, tutorial_room, design_studio, meeting_room };
struct node {
string data;
int id;
int capacity;
int year, month, day;
int deleteDate;
roomType type;
node* next;
};
node* front;
node * tail;
room() {
length=0;
initial=1;
front = NULL;
tail = NULL;
}
bool isFull () { return length>=20; }
// Add Rooms
void room::addRoom() {
system("cls");
if (isFull()) {
cout<<" No more than 25 rooms are allowed\n"<<endl;
return;
}
cout << "Enter the capacity" << endl;
int a;
cin >> a;
node* temp = new node();
temp->data = "Not Reserved";
temp->id = initial;
temp->year = 0;
temp->month = 0;
temp->day = 0;
temp->deleteDate = 0;
initial++;
temp->capacity = a;
temp->next = NULL;
if (front == NULL && tail == NULL)
{
front = temp;
tail = temp;
}
else {
tail->next = temp;
tail = temp;
}
cout << "Choose The type" << endl;
cout << "1- Main Hall \t 2- Lecture Room \t 3- Exam Hall \t 4- Meeting Room \t 5- Design Studio \t 6- Tutorial Room" << endl;
int t;
cin >> t;
if (t == 1)
{
temp->type = roomType::main_hall;
}
else if (t == 2)
{
temp->type = roomType::lecture_room;
}
else if (t == 3)
{
temp->type = roomType::exam_hall;
}
else if (t == 4)
{
temp->type = roomType::meeting_room;
}
else if (t == 5)
{
temp->type = roomType::design_studio;
}
else if (t == 6)
{
temp->type = roomType::tutorial_room;
}
else {cout << "Wrong Input!" << endl;}
length++;
cout<<"\n Successfully Created!\n\n";
system("pause");
save();
}
void reserveRoom()
{
system("cls");
show();
cout << "=============================================================" << endl;
cout << "Enter the room ID you want to Book !" << endl;
int id;
cin >> id;
node* tmp = front;
while (tmp != NULL) {
if (tmp->id == id) {
if(tmp->data == "Not Reserved"){
tmp->data = "Reserved";
int y,m,d;
cout << "Enter the year " << endl;
cin >> y;
cout << "Enter the month " << endl;
cin >> m;
cout << "Enter the day " << endl;
cin >> d;
tmp->year = y;
tmp->month = m;
tmp->day = d;
tmp->deleteDate = y+m+d;
cout << "Room Reserved!" << endl;
}
else{
cout << "This room has been reserved!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!" << endl;
}
}
tmp = tmp->next;
}
system("pause");
room::save();
}
};
Continuing from my comment above, whenever you need to separate words in a line with a delimiter between them, a standard approach is to read each line into a string with getline, then create a stringstream from the line and read each word into a string using getline with the delimiter specified.
Why Read a Line with getline and Read a stringstream with getline Again?
Answer: line-control.
While you could simply read directly from your file using getline and a delimiter, which would separate each word, how would you know when one line ended and the next line began? When you specify the delimiter to use with getline, getline will read until the delimiter is found or end of input or str.max_size characters have been read. See cppreference.com - std::getline. So there is no special meaning to the line-ending '\n' in this case.
However, if you read the entire line into a string and then create a stringstream from the line, you know you can only read until the end-of-line as that will trigger the end-of-file condition on input. So even though you are using getline with a delimiter, it can now only read as far as the end of line.
A short example using this approach and using your data file will show how you can separate each line into words that you can then add to each node of your list, e.g.
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <cstdio> /* perror */
int main (int argc, char **argv) {
std::string line; /* string to hold each line */
if (argc < 2) { /* validate at least 1 argument given */
std::cerr << "error: insufficient input.\n"
"usage: " << argv[0] << " filename\n";
return 1;
}
std::ifstream f (argv[1]); /* open file */
if (!f.is_open()) { /* validate file open for reading */
perror (("error opening file " + std::string(argv[1])).c_str());
return 1;
}
while (getline (f, line)) { /* read each line into line */
std::string word; /* string to hold words */
std::stringstream s (line); /* create stringstream from line */
while (getline (s, word, '-')) /* read hyphen separated words */
std::cout << word << '\n'; /* output words */
std::cout << '\n'; /* tidy up with newline between data */
}
}
Example Input File
$ cat ../dat/hyphenstr.txt
1-Reserved-2018-12-23-Lecture Room-13
2-Reserved-2018-11-34-Tutorial Room-15
3-Not Reserved-0-0-0-Design Studio-18
4-Reserved-2018-11-16-Lecture Room-14
5-Not Reserved-0-0-0-Exam Hall-18
Example Use/Output
Note, the code above simply outputs an additional '\n' between the words separated from each line. You would write logic (perhaps using a counter, and, e.g. stoi for any needed conversions) to convert the values to integer values and store each in its proper field.
$ ./bin/getline_hyphen ../dat/hyphenstr.txt
1
Reserved
2018
12
23
Lecture Room
13
2
Reserved
2018
11
34
Tutorial Room
15
3
Not Reserved
0
0
0
Design Studio
18
4
Reserved
2018
11
16
Lecture Room
14
5
Not Reserved
0
0
0
Exam Hall
18
You can also remove the separators from each line, create a separate stringstream without the hyphens and use >> to read and convert the values for each node. (this second approach is left to you)
Look things over and let me know if you have further questions.

C++ Searching CSV file from inputted string

I am trying to create a program that will load the CSV file and based upon the inputted word search through the file and return any lines that contain the word. The CSV file is a mass download of tweets and has the following columns:
Date & Time Created
The Tweet
The tweets are also surrounded by b'TWEET TEXT HERE', so I need to remove the b' ' when printing them. Sadly, I am unable to change anything in the CSV file, so I can't remove it manually. The issues I am having are:
Listing the total amount of tweets within the file the program just freezes
Removing the b' ' from the tweets
The else statement causes "not found" to be constantly printed
Code I currently have that is returning the tweets that contain the inputted word but also the false positive.
The current output when running the below code
#include "stdafx.h"
#include <cstring>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
int main()
{
string token;
ifstream fin;
fin.open("sampleTweets.csv");
if (fin.is_open())
{
cout << "File opened successfully" << "\n";
}
else {
cout << "Error opening file" << "\n";
}
cout << "Enter search word: ";
cin >> token;
"\n";
string line;
while (getline(fin, line)) {
if (line.find(token) != string::npos) {
cout << line << endl;
} else {
cout << token << " not found" << endl;
}
}
fin.close();
char anykey;
cout << "press any key";
cin >> anykey;
return 0;
}
Code I was using for counting total tweets
int count = 0;
char str[140];
while (!fin.eof())
{
fin.getline(str, 140);
count++;
}
cout << "Number of lines in file are " << count;
Any help on this would be amazing as I am quite new to C++ and not sure where to go from here!
You can remove the "b" with erase:
if (line.find(token) != string::npos){
int n= line.find(",");
line.erase(n+1, 3);
cout << line << endl;
}
and you can count the lines inside the while loop:
int count = 0;
while (getline(fin, line)) {
++count;
...
}
EDIT: you can remove the extra quotes and commas like so:
line[n] = ' '; // change the comma into a space
line.erase(n+1, 4); // remove "b""
line.resize(line.size()-5); // remove trailing """,,

Word Counting Program C++

I'm currently trying to program a word counting program in C++ and am running into difficulties getting it to parse through a string and separate words from one another. In addition to this I am having a hard time getting the word count for unique words to increment each time the word repeats. My findWord() and DistinctWords() functions are most likely the issues from what I can tell. Perhaps you will see something I do not though in the others, as for the aforementioned functions I have no clue as to what's messing up in them. These are the directions provided by my instructor:
Create a program which will count and report on the number of occurrences of distinct, case insensitive words in a text file.
The program should have a loop that:
1. Prompts the user to enter a file name. Terminates the loop and the program if the user presses the Enter key only.
2. Verifies that a file with the name entered exists. If the file does not exist, display an appropriate message and return to step 1.
3. Reads and displays the contents of the file.
4. Displays a count of the distinct words in the file.
5. Displays a sorted list of each of the distinct words in the file and the number of occurrences of each word. Sort the list in descending order by word count, ascending order by word.
I am pretty stuck right now and my assignment is due at midnight. Help would certainly be greatly appreciated. Thank you for your time. Here is the code I have, I will also copy paste an example test text file after it:
#include <iostream>
#include <iomanip>
#include <string>
#include <fstream> // Needed to use files
#include <vector>
#include <algorithm> // Needed for sort from standard libraries
using namespace std;
struct WordCount{
string word; // Word
int count; // Occurence #
void iCount(){ count++; }
WordCount(string s){ word = s; count = 1;}
};
// Function prototypes
string InputText(); // Get user file name and get text from said file
string Normalize(string); // Convert string to lowercase and remove punctuation
vector<WordCount> DistinctWords(string); // Sorted vector of word count structures
bool findWord(string, vector<WordCount>); // Linear search for word in vector of structures
void DisplayResults(vector<WordCount>); // Display results
// Main
int main(int argc, char** argv) {
// Program Title
cout << "Lab 9 - Text File Word Counter\n";
cout << "-------------------------------\n\n";
// Input text from file
string buffer = InputText();
while (buffer != ""){
// Title for text file reading
cout << "\nThis is the text string read from the file\n";
cout << "-------------------------------------------\n";
cout << buffer << endl << endl;
// Build vector of words and counts
vector<WordCount> words = DistinctWords(buffer);
// Display results
cout << "There are " << words.size() << " unique words in the above text." << endl;
cout << "--------------------------------------------" << endl << endl;
DisplayResults(words);
buffer = InputText();
}
return 0;
}
/***********************************************
InputText() -
Gets user file name and gets text from the file.
************************************************/
string InputText(){
string fileName;
ifstream inputFile; // Input file stream object
string str; // Temporary string
string text; // Text file string
cout << "File name? ";
getline(cin, fileName);
// Case to terminate the program for enter key
if (fileName.empty()){ exit(0);}
// Open file
inputFile.open(fileName);
if (!inputFile){
cout << "Error opening data file\n";
cout << "File name? "; cin >> fileName;
}
else{
while (!inputFile.eof()){
getline(inputFile, str);
text += str;
}
}
inputFile.close(); return text;
}
/****************************************************
Normalize(string) -
Converts string to lowercase and removes punctuation.
*****************************************************/
string Normalize(string s){
// Initialize variables
string nString;
char c;
// Make all text lowercase
for (int i = 0; i < s.length(); i++){
c = s[i];
c = tolower(c);
nString += c;
}
// Remove punctuation
for (int i = 0; i < nString.length(); i++){
if (ispunct(nString[i]))
nString.erase(i, 1);
}
// Return converted string
return nString;
}
/******************************************
vector<WordCount> DistinctWords(string) -
Sorts vector of word count structures.
*******************************************/
vector<WordCount> DistinctWords(string s){
vector<WordCount> words; // Initialize vector for words
string nString = Normalize(s); // Convert passed string to lowercase and remove punctuation
// Parse string
istringstream iss(nString);
while(iss >> nString){
string n; // Intialize temporary string
iss >> n; // Put word in n
if (findWord(n, words) == true){ continue; } // Check to verify that there is no preexisting occurence of the word passed
else{
WordCount tempO(n); // Make structure object with n
words.push_back(tempO); // Push structure object into words vector
}
}
return words;
}
/*********************************************
bool findWord(string, vector<WordCount>) -
Linear search for word in vector of structures
**********************************************/
bool findWord(string s, vector<WordCount> words){
// Search through vector
for (auto r : words){
if (r.word == s){ // Increment count of object if found again
r.iCount(); return true;
}
else // Go back to main function if not found
return false;
}
}
/***********************************************
void DisplayResults(vector<WordCount>) -
Displays results.
************************************************/
void DisplayResults(vector<WordCount> words){
// TROUBLESHOOT FIRST ERASE THIS AFTER!!!!!
cout << "Word" << setw(20) << "Count\n";
cout << "-----------------------\n";
for (auto &r : words){
cout << setw(6) << left << r.word;
cout << setw(15) << right << r.count << endl;
}
}
It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness, it was the spring of hope, it was the winter of despair, we had everything before us, we had nothing before us, we were all going direct to heaven, we were all going direct the other way - in short, the period was so far like the present period, that some of its noisiest authorities insisted on its being received, for good or for evil, in the superlative degree of comparison only.
This is the example display he provided for this particular test file
You almost had it!
You just forgot to pass the 'words' vector by reference instead of by copy.
Also I included a custom comparator for the sort at the end.
#include <iostream>
#include <sstream>
#include <cctype> // isalpha, isblank, tolower
#include <cstdlib> // exit
#include <iomanip>
#include <string>
#include <fstream> // Needed to use files
#include <vector>
#include <algorithm> // Needed for sort from standard libraries
using namespace std;
struct WordCount{
string word; // Word
int count; // Occurence #
void iCount(){ count++; }
WordCount(string s){ word = s; count = 1;}
};
struct {
bool operator()(const WordCount& a, const WordCount& b)
{
if (a.count < b.count)
return false;
else if (a.count > b.count)
return true;
else{
if (a.word < b.word)
return true;
else
return false;
}
}
} CompareWordCount;
// Function prototypes
string InputText(); // Get user file name and get text from said file
string Normalize(string); // Convert string to lowercase and remove punctuation
vector<WordCount> DistinctWords(string); // Sorted vector of word count structures
bool findWord(string, vector<WordCount>&); // Linear search for word in vector of structures
void DisplayResults(vector<WordCount>); // Display results
// Main
int main(int argc, char** argv) {
// Program Title
cout << "Lab 9 - Text File Word Counter\n";
cout << "-------------------------------\n\n";
// Input text from file
string buffer = InputText();
while (buffer != ""){
// Title for text file reading
cout << "\nThis is the text string read from the file\n";
cout << "-------------------------------------------\n";
cout << buffer << endl << endl;
// Build vector of words and counts
vector<WordCount> words = DistinctWords(buffer);
// Display results
cout << "There are " << words.size() << " unique words in the above text." << endl;
cout << "--------------------------------------------" << endl << endl;
DisplayResults(words);
buffer = InputText(); // Prompt for the next file; InputText() exits on an empty name
}
return 0;
}
/***********************************************
InputText() -
Gets user file name and gets text from the file.
************************************************/
string InputText(){
string fileName;
ifstream inputFile; // Input file stream object
string str; // Temporary string
string text; // Text file string
cout << "File name? ";
getline(cin, fileName);
// Case to terminate the program for enter key
if (fileName.empty()){ exit(0);}
// Open file
inputFile.open(fileName);
if (!inputFile){
cout << "Error opening data file\n";
cout << "File name? "; cin >> fileName;
}
else{
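while (getline(inputFile, str)){ // Stop when a read fails instead of testing eof()
text += str;
text += ' '; // Keep the last word of one line apart from the first word of the next
}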
}
inputFile.close(); return text;
}
/****************************************************
Normalize(string) -
Converts string to lowercase and removes punctuation.
*****************************************************/
string Normalize(string s){
// Initialize variables
string nString;
char c;
// Make all text lowercase
for (int i = 0; i < s.length(); i++){
c = s[i];
c = tolower(c);
if (isalpha(c) || isblank(c))
nString += c;
}
// Return converted string
return nString;
}
/******************************************
vector<WordCount> DistinctWords(string) -
Sorts vector of word count structures.
*******************************************/
vector<WordCount> DistinctWords(string s){
vector<WordCount> words; // Initialize vector for words
string nString = Normalize(s); // Convert passed string to lowercase and remove punctuation
// Parse string
istringstream iss(nString);
string n; // Intialize temporary string
while(iss >> n){
if (findWord(n, words) == true){ continue; } // Check to verify that there is no preexisting occurence of the word passed
else{
WordCount tempO(n); // Make structure object with n
words.push_back(tempO); // Push structure object into words vector
}
}
return words;
}
/*********************************************
bool findWord(string, vector<WordCount>) -
Linear search for word in vector of structures
**********************************************/
bool findWord(string s, vector<WordCount>& words){
// Search through vector
for (auto& r : words){
if (r.word.compare(s) == 0){ // Increment count of object if found again
r.iCount(); return true;
}
}
return false; // Word not seen before; the caller adds a new entry
}
/***********************************************
void DisplayResults(vector<WordCount>) -
Displays results.
************************************************/
void DisplayResults(vector<WordCount> words){
// TROUBLESHOOT FIRST ERASE THIS AFTER!!!!!
cout << "Word" << setw(20) << "Count\n";
cout << "-----------------------\n";
sort(words.begin(), words.end(),CompareWordCount);
for (auto &r : words){
cout << setw(6) << left << r.word;
cout << setw(15) << right << r.count << endl;
}
}
Consider using std::map for the word-count task:
#include <iostream>
#include <map>
#include <string>
#include <vector>
using namespace std;
int main()
{
map<string, int> wordCount;
vector<string> inputWords = {"some", "test", "stuff", "test",
"stuff"}; // read from file instead
for(auto& s: inputWords)
wordCount[s]++; // count each occurrence
for(auto& entry: wordCount) // print all words and associated counts
cout << entry.first << " " << entry.second << endl;
cout << wordCount.size() << endl; // number of distinct words
}

reading from text file and changing values c++

#include <iostream>
#include <fstream>
#include <iomanip> // For formatted input
#include <cctype> // For the "is" character functions
#include <cstring> // For strncpy, strncat and strlen functions
#include <cstdlib> // For exit function
using namespace std;
int main() {
ifstream fin; // Declare and name the input file stream object
ofstream fout; // Declare and name the output file stream object
char in_file[51]; // Filename for the input file
char out_file[56]; // Filename for the output file
char c; // Current character in the file
cout << "Enter the filename of the input file: ";
cin >> setw(51) >> in_file; //setting max length of file name
strncpy(out_file, in_file, 50);
strncat(out_file, ".norm", 50 - strlen(out_file));
fin.open(in_file);
if(fin.fail()) {
cout << "Cannot open " << in_file << " for reading.\n";
exit(1);
}
fout.open(out_file);
if(fout.fail()) {
cout << "Cannot open " << out_file << " for writing.\n";
exit(1);
}
while(fin.get(c))
{
/* commented this out to see if a switch statement would output differently
if (isupper(c))
{
c=tolower(c);
putchar(c);
}
if (c=='/n')
{
fout<< endl << endl;
}
if (c=='/t')
{
for(int i=0; i<9; i++)
fout<<" ";
}
*/
switch (c)
{
case '\t' : // replace 'tab' by '8 chars'
fout << "        ";
break;
case '\n' : //replace 1 newline with 2
fout<<"\n"<<"\n";
break;
default: // use default case to proccess all data and
if (isupper (c)) { // test upper/lower-case.
char c2 = tolower (c);
fout << c2;
} else {
fout << c;
}
}
fin >> noskipws >> c; // read the next character
}
fin.close();
fout.close();
cout << in_file << " has been normalized into " << out_file << endl;
return(0);
}
What I'm trying to do is take an input text file, append .norm to its name, and write a normalized copy with: (1) all tabs replaced with 8 spaces, (2) all upper case converted to lower case, (3) the text double-spaced. I thought my code would accomplish this, but I'm getting really weird output.
Here's an example of a text input:
DOE JOHN 56 45 65 72
DOE jane 42 86 58 69
doe tom 89 92 75 86
which then was output to:
dejh 64 57
o ae4 65 9detm8 27 6
I have no idea what's going wrong and would really appreciate any help.
while(fin.get(c))
reads a character at the beginning of every iteration of the while loop. But inside the while loop body, right at the end
fin >> noskipws >> c;
reads another character. This second character will be promptly written over by while(fin.get(c)) and never be inspected.
This is shown by the OP's output: Every second character is transformed and written to the file.
Recommendation to OP: Learn to use your IDE's debugger. This was a trivial error that would have been immediately apparent if OP stepped through a few loop iterations.