How to read a message from a file, modifying only words? - c++

Suppose I have the following text:
My name is myName. I love
stackoverflow .
Hi, Guys! There is more than one space after "Guys!" 123
And also after "123" there are 2 spaces and newline.
Now I need to read this text file as it is and act only on the alphanumeric words. Afterwards I have to print it with the changed words, but with spaces, newlines, and punctuation unchanged and in the same positions. The length of each alphanumeric word stays the same when it is changed. I have tried this with a library that checks for alphanumeric values, but the code gets very messy. Is there any other way?

You can read your file line by line with the fgets() function. It fills a char array that you can then work with: iterate over the array, split it into alphanumeric words, change the words, and then write the modified string to a new file with the fwrite() function.
If you prefer the C++ way of working with files (iostream), you can use istream::getline. It preserves spaces, but it consumes the "\n". If you need to keep even the line ending (which is sometimes '\r' or '\r\n' rather than '\n'), you can use istream::get.
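The "iterate, split into alnum words, change the words" step can be sketched as follows. This is only a minimal sketch: transformWords and reversed are hypothetical names, and it assumes the change function returns a string of the same length as its input, as the question requires.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: applies `change` to every maximal run of alphanumeric
// characters in `text`. All other characters (spaces, newlines, punctuation)
// stay exactly where they are.
// Assumption: `change` returns a string of the same length as its argument.
template <typename Change>
std::string transformWords(std::string text, Change change) {
    std::size_t i = 0;
    while (i < text.size()) {
        if (!std::isalnum(static_cast<unsigned char>(text[i]))) {
            ++i;  // not part of a word: leave untouched
            continue;
        }
        const std::size_t start = i;
        while (i < text.size() &&
               std::isalnum(static_cast<unsigned char>(text[i]))) {
            ++i;  // extend to the end of the word
        }
        text.replace(start, i - start, change(text.substr(start, i - start)));
    }
    return text;
}

// Hypothetical example change: reverse the word (length is unchanged).
std::string reversed(const std::string& word) {
    return std::string(word.rbegin(), word.rend());
}
```

Calling transformWords(buffer, reversed) on the whole file contents, or on each chunk you read, leaves every non-alphanumeric character in place.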

Maybe you should look at Boost Tokenizer. It can break a string into a series of tokens and iterate through them. The following sample breaks up a phrase into words:
#include <boost/tokenizer.hpp>
#include <iostream>
#include <string>

int main()
{
    std::string s = "Hi, Guys! There is more...";
    // The default tokenizer splits on spaces and punctuation.
    boost::tokenizer<> tok(s);
    for (boost::tokenizer<>::iterator beg = tok.begin(); beg != tok.end(); ++beg)
    {
        std::cout << *beg << "\n";
    }
    return 0;
}
But in your case you need to provide a TokenizerFunc that will break up a string at alphanumeric/non-alphanumeric boundaries.
For more information see Boost Tokenizer documentation and implementation of an already provided char_separator, offset_separator and escaped_list_separator.

When code gets messy, the reason is usually that the problem was not broken down into clear functions and classes. If you do that, you will have a few functions that each do precisely one thing (not messy). Your main function will then just call these simple functions. If the function names are well chosen, the main function will become short and clear, too.
In this case, your main function needs to do:
Loop: Read every line of a file
On every line, check if and where a "special" word occurs.
If a special word occurs, replace it
Extra hints: a line of text can be stored as a std::string and can be read by std::getline(std::cin, line)
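The breakdown above could look roughly like this. It is a sketch only: replaceAll and processStream are hypothetical names, the "special" word is matched as a plain substring, and a '\n' is written after every line even if the last input line had none.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>  // std::istringstream / std::ostringstream, handy for trying this out
#include <string>

// Hypothetical helper: replaces every occurrence of `from` in `line` with `to`.
std::string replaceAll(std::string line, const std::string& from,
                       const std::string& to) {
    if (from.empty()) return line;  // nothing sensible to search for
    std::size_t pos = 0;
    while ((pos = line.find(from, pos)) != std::string::npos) {
        line.replace(pos, from.size(), to);
        pos += to.size();  // continue after the replacement
    }
    return line;
}

// Hypothetical driver: reads every line, fixes it, and writes it out.
void processStream(std::istream& in, std::ostream& out,
                   const std::string& from, const std::string& to) {
    std::string line;
    while (std::getline(in, line)) {
        out << replaceAll(line, from, to) << '\n';
    }
}
```

With std::cin and std::cout as the two streams, main shrinks to a single call to processStream.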

Related

When reading from a file in C++, can I just copy the text itself?

Sorry, the wording for the actual question is probably wrong. I have a program that reads in a line from a .txt file and then puts the string into an object to compare it to a string entered by the user. I haven't been able to get it to match, and when I've tried to see what is entered, I don't see much. Maybe there's an invisible character denoting the end of the line? I've tried code like this:
std::cout << "...." << table[row][col]->get() << "...." <<std::endl;
And got
....a
as the result. When reading the file I used std::getline() if that makes a difference.
I didn't find a true fix, although I did see that the length of the read-in string was one character longer than the actual word. I was able to use a substring to cut the end off of the string.
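One plausible cause of the extra character, assuming the file has Windows "\r\n" line endings, is a trailing '\r' that std::getline leaves in the string. That would also explain the output: the '\r' moves the cursor back to the start of the line, so the trailing dots overwrite the leading ones. A minimal sketch of the workaround, with the hypothetical name stripTrailingCR:

```cpp
#include <string>

// Hypothetical helper: removes a single trailing '\r' if there is one.
// Assumption: the extra character is a carriage return from "\r\n" endings.
void stripTrailingCR(std::string& line) {
    if (!line.empty() && line.back() == '\r') {
        line.pop_back();
    }
}
```

Calling this right after each std::getline keeps the stored strings comparable with what the user types.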

C++ count functional words occurrence

I'm trying to count occurrences of specific words in a text file. The problem is that my code reads the file with whitespace as the delimiter, but some of the words I want to count are "two-word words", for example "out from".
In addition, there is a second problem with words like "aren't" and "don't": my code seems to ignore these words even when I put them in the map with a backslash. My guess is that they get ignored while reading from the file for some reason.
The end result I am looking for is the frequency of the words that I am searching for.
std::list<std::string> Fwords = {
"a","abroad","as far as","ahead of"};
// Begin reading from file:
std::ifstream fileStream(fileName);
// Check if we've opened the file (as we should have).
if (fileStream.is_open())
while (fileStream.good())
{
// Store the next word in the file in a local variable.
std::string word;
fileStream >> word;
std::cout << "This is the word: " << word << std::endl;
if (std::find(std::begin(Fwords), std::end(Fwords), word) != std::end(Fwords))
wordsCount[word]++;
}
input:
"ahead of me as far as abroad me"
this would be the expected output:
abroad:1
ahead of:1
as far as:1
This approach won't work. Your problem is that you're reading one word at a time from the file. No amount of backslashing or manipulating the list / map of words will fix that.
But how are you supposed to know how many words to read? You don't—it'll have to be trial and error.
One way to "brute force" this, considering your level of programming, would be to add an else case to
if (std::find(std::begin(Fwords), std::end(Fwords), word) != std::end(Fwords))
{
// ...
}
in which you check for phrases in the map that begin with the word from the file followed by a space, e.g. for the word "as" the search is for "as ". If one or more matches are found, it's time to read another word from the file, giving e.g. "as far". Put this in a loop (or a function called in a loop) so that searching for "as far " and reading another word happen automatically. Upon successfully finding "as far as", you're done. You're also done when no entry in your map begins with "as ", "as far ", or "as far as". In that case, run a for loop through each word you have read to check whether it is a word by itself, and increase its count if so. Along the way you'll realize that you need the same code as in your original version, so it'd be smart to factor it out into a function as well.
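A variation on this idea, sketched under a few assumptions: instead of reading lazily, it buffers all words first and then tries the longest candidate phrase at each position. The name countPhrases and the maxWords parameter (the largest number of words in any phrase) are hypothetical, and words are matched exactly, with no punctuation stripping.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <map>
#include <set>
#include <sstream>  // std::istringstream, handy for trying this out
#include <string>
#include <vector>

// Hypothetical function: counts single- and multi-word phrases in a stream.
std::map<std::string, int> countPhrases(std::istream& in,
                                        const std::set<std::string>& phrases,
                                        std::size_t maxWords) {
    std::vector<std::string> words;
    for (std::string w; in >> w;) words.push_back(w);

    std::map<std::string, int> counts;
    std::size_t i = 0;
    while (i < words.size()) {
        std::size_t matched = 0;
        const std::size_t limit = std::min(maxWords, words.size() - i);
        // Try the longest candidate first, then shorter ones.
        for (std::size_t n = limit; n >= 1; --n) {
            std::string candidate = words[i];
            for (std::size_t k = 1; k < n; ++k) candidate += ' ' + words[i + k];
            if (phrases.count(candidate)) {
                ++counts[candidate];
                matched = n;
                break;
            }
        }
        i += matched ? matched : 1;  // skip the phrase, or just one word
    }
    return counts;
}
```

Preferring the longest match means "as far as" is counted as one phrase rather than as several shorter entries.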

Differentiating between delimiter and newline in getline

ifstream file;
file.open("file.csv");
string str;
while(file.good())
{
getline(file,str,',');
if (___) // string was split from delimiter
{
[do this]
}
else // string was split from eol
{
[do that]
}
}
file.close();
I'd like to read from a csv file, and differentiate between what happens when a string is split off due to a new line and what happens when it is split off due to the desired delimiter -- i.e. filling in the ___ in the sample code above.
The approaches I can think of are:
(1) manually adding a character to the end of each line in the original file,
(2) automatically adding a character to the end of each line by writing to another file,
(3) using getline without the delimiter and then making a function to split the resulting string by ','.
But is there a simpler or direct solution?
(I see that similar questions have been asked before, but I didn't see any solutions.)
My preference for clarity of the code would be to use your option 3) - use getline() with the standard '\n' delimiter to read the file into a buffer line by line and then use a tokenizer like strtok() (if you want to work on the C level) or boost::tokenizer to parse the string you read from the file.
You're really dealing with two distinct steps here, first read the line into the buffer, then take the buffer apart to extract the components you're after. Your code should reflect that and by doing so, you're also avoiding having to deal with odd states like the ones you describe where you end up having to do additional parsing anyway.
There is no easy way to determine "which delimiter terminated the string", and it gets "consumed" by getline, so it's lost to you.
Read the line, and split on commas yourself. You can use std::string::find() to find commas. However, if your file contains strings that themselves contain commas, you will have to parse the string character by character, since you need to distinguish between commas in quoted text and commas in unquoted text.
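A character-by-character split might look like the sketch below. The name splitCsvLine is hypothetical, and the handling is simplified: the quote characters are dropped from the output, and escaped quotes ("") inside a field are not supported.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits one line on commas that are outside double quotes.
std::vector<std::string> splitCsvLine(const std::string& line) {
    std::vector<std::string> fields;
    std::string field;
    bool inQuotes = false;
    for (char c : line) {
        if (c == '"') {
            inQuotes = !inQuotes;  // entering or leaving quoted text
        } else if (c == ',' && !inQuotes) {
            fields.push_back(field);  // unquoted comma ends the field
            field.clear();
        } else {
            field += c;
        }
    }
    fields.push_back(field);  // the last field has no trailing comma
    return fields;
}
```

Each line obtained with std::getline(file, str) can be passed to this function, so the end of a line and the end of a field are handled in two separate places.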
Your big problem is your code does not do what you think it does.
getline with a delimiter treats \n as just another character from my reading of the docs. It does not split on both the delimiter and newline.
The efficient way to do this is to write your own custom splitting getline: cppreference has a pretty clear description of what getline does, so mimicking it should be easy (and safer than shooting from the hip, files are tricky).
Then return both the string, and information about why you finished your parse in a second channel.
Now, using getline naively and then splitting is also viable, and will be much faster to write, and probably less error prone to boot.
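A custom splitting read with a "second channel" could be sketched like this. The names Stop and readField are hypothetical, and the function only loosely mimics std::getline: it does not reproduce its stream-state handling.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this out
#include <string>

// Hypothetical result type: why the read stopped.
enum class Stop { Delimiter, Newline, EndOfInput };

// Hypothetical function: reads characters into `field` until `delim`, '\n',
// or the end of input, and reports which of the three ended the field.
Stop readField(std::istream& in, std::string& field, char delim = ',') {
    field.clear();
    char c;
    while (in.get(c)) {
        if (c == delim) return Stop::Delimiter;
        if (c == '\n') return Stop::Newline;
        field += c;
    }
    return Stop::EndOfInput;
}
```

The return value is what would fill in the ___ in the question's sample: Stop::Delimiter means the string was split at the delimiter, Stop::Newline means it was split at the end of a line.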

Good way to tokenize a string to store values? Or alternative for user input

Hello again Stackoverflow, I'm here again asking a question for my C++ programming class. The problem I am facing mostly has to do with user input from the keyboard. I need to be able to take the user input to decide what function to call and what arguments to give the function. For example, something like add 5 would call the add function with the argument 5. At first I tried overloading the >> operator to take both a string and an int, but the problem I ran into was that the program was unable to take input without the int, such as deletemax, so I had to throw that idea out. So now I am back to tokenizing the input, but we are not allowed to use Boost for this program, so I came up with something like this using sstream:
bool out = false;
string token;
string In;
int num;
do
{
cout << "heap> ";
cin >> In;
istringstream iss(In);
while(getline(iss, token, ' '))
{
cout << token << endl; //I know this is incorrect, I'm just not sure what to replace it with
}
out = ProcessCommand (token, num); //Takes string and int to call correct functions
} while (out != true);
The problem is that I'm not quite sure how to correctly tokenize the string so I can get 2 strings and convert the second string to an int. Can anyone offer me some assistance? I would greatly appreciate it. Also, if there is a better way to go about this than what I am trying, I would like to hear it.
Thanks for any help you can give me.
Googling "C++ string tokenize" will get you plenty of hits, with the first hit being on Stackoverflow. But you should take a stab at it. I'm guessing it's the point of the exercise.
You said "argumentS", which suggests that commands you support take varying numbers of arguments. I'd break it down like this:
1. read a line from the user
2. split line into 'tokens' on space boundaries, store tokens in a list
3. based on the first token in the list, choose a command to execute
4. pass the list of tokens to the command, so it can validate/interpret them as arguments
The tricky part is #2. Do you know about container classes yet? You can use vector<string> to store the chunks you parse. To do the actual parsing, you iterate through the characters of the string. Skip whitespace until you find a non-whitespace character (or run out of characters). Save this position: start. Then skip non-whitespace until you find whitespace (or run out of characters). Save this position: end. Copy the substring from start to end into your token list.
Working out the actual details of this, making sure you don't have off-by-one errors, etc., is going to be challenging if you've never done it before, which I'm guessing is the point.
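For comparison once you have had a go at it yourself, the start/end scan described above can be sketched like this. The name tokenize is hypothetical, and `i` plays the role of "end" once a token has been scanned.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits `line` into whitespace-separated tokens.
std::vector<std::string> tokenize(const std::string& line) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < line.size()) {
        // Skip whitespace until a non-whitespace character (or the end).
        while (i < line.size() &&
               std::isspace(static_cast<unsigned char>(line[i]))) {
            ++i;
        }
        if (i == line.size()) break;  // only whitespace was left
        const std::size_t start = i;
        // Skip non-whitespace until whitespace (or the end).
        while (i < line.size() &&
               !std::isspace(static_cast<unsigned char>(line[i]))) {
            ++i;
        }
        tokens.push_back(line.substr(start, i - start));  // [start, end)
    }
    return tokens;
}
```

The first token is then the command name, and the remaining tokens are its arguments.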
You don't need to read in the whole of user input all at once.
For example you could read in the first bit of user input (the operation, add or deletemax, etc). From there depending on the operation you could continue to read arguments from input (in the case of add) or begin performing the operation immediately (in the case of deletemax).
One way would be to have a std::map of function names as keys and required number of arguments as values. You'd read a line of input, get the function name and then decide whether you need additional arguments. I'd write a function that'd return a vector of arguments extracted from a string stream or an empty vector in case the input was invalid.
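A rough sketch of that idea follows. The table kArity and the function parseCommand are hypothetical, the two commands are taken from the question, and validity is reported through a bool rather than an empty vector so that zero-argument commands such as deletemax stay distinguishable from invalid input.

```cpp
#include <cstddef>
#include <exception>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical command table: command name -> required number of int arguments.
const std::map<std::string, std::size_t> kArity = {{"add", 1}, {"deletemax", 0}};

// Hypothetical parser: returns true and fills `name` and `args` only when
// `line` holds a known command followed by exactly the required number of ints.
bool parseCommand(const std::string& line, std::string& name,
                  std::vector<int>& args) {
    std::istringstream iss(line);
    args.clear();
    if (!(iss >> name)) return false;      // empty line
    const auto it = kArity.find(name);
    if (it == kArity.end()) return false;  // unknown command
    for (std::string token; iss >> token;) {
        std::size_t used = 0;
        int value = 0;
        try {
            value = std::stoi(token, &used);
        } catch (const std::exception&) {
            return false;  // not a number, or out of range
        }
        if (used != token.size()) return false;  // trailing junk such as "5x"
        args.push_back(value);
    }
    return args.size() == it->second;
}
```

A line read with std::getline(std::cin, line) can be passed in directly, which also avoids the problem that cin >> In stops at the first space.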

How do you read a word in from a file in C++?

So I was feeling bored and decided I wanted to make a hangman game. I did an assignment like this back in high school when I first took C++. But this was before I even took geometry, so unfortunately I didn't do well in any way, shape, or form in it, and after the semester I trashed everything in a fit of rage.
I'm looking to make a txt document and just throw in a whole bunch of words
(ie:
test
love
hungery
flummuxed
discombobulated
pie
awkward
you
get
the
idea
)
So here's my question:
How do I get C++ to read a random word from the document?
I have a feeling #include<ctime> will be needed, as well as srand(time(0)); to get some kind of pseudorandom choice...but I haven't the foggiest on how to have a random word taken from a file...any suggestions?
Thanks ahead of time!
Here's a rough sketch, assuming that the words are separated by whitespace (space, tab, newline, etc):
vector<string> words;
ifstream in("words.txt");
string word;
// Stop as soon as a read fails, so no empty word is stored at end of file.
while (in >> word) {
    words.push_back(word);
}
// Assumes at least one word was read.
string r = words[rand() % words.size()];
The operator >> used on a string reads one whitespace-separated word from a stream.
So the question is do you want to read the file each time you pick a word or do you want to load the file into memory and then pick up the word from a memory structure. Without more information I can only guess.
Pick a Word from a file:
// Note: an ifstream is also an istream.
std::string pickWordFromAStream(std::istream& s, std::size_t pos)
{
    std::istream_iterator<std::string> iter(s);
    // Advance past the first pos words.
    for (; pos; --pos)
    {
        ++iter;
    }
    // This code assumes that pos is smaller than
    // the number of words in the file
    return *iter;
}
Load a file into memory:
void loadStreamIntoVector(std::istream& s, std::vector<std::string>& words)
{
    // words is taken by reference so that the caller sees the loaded words.
    std::copy(std::istream_iterator<std::string>(s),
              std::istream_iterator<std::string>(),
              std::back_inserter(words)
             );
}
Generating a random number should be easy enough, assuming you only want pseudo-random.
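For the pseudo-random choice, one option is the standard <random> header instead of rand(). This is a minimal sketch: pickRandomWord is a hypothetical name, and it assumes the vector is not empty.

```cpp
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper: returns a uniformly chosen element of `words`.
// Assumption: `words` contains at least one element.
const std::string& pickRandomWord(const std::vector<std::string>& words,
                                  std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> dist(0, words.size() - 1);
    return words[dist(rng)];
}
```

Seeding the generator once, e.g. std::mt19937 rng(std::random_device{}());, and reusing it for every pick avoids getting the same word on each run.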
I would recommend creating a plain text file (.txt) in Notepad and using the standard C file APIs (fopen(), and fread()) to read from it. You can use fgets() to read each line one at a time.
Once you have your plain text file, just read each line into an array and then randomly choose an entry in the array using the method you've suggested above.