Fixing text in a .txt file - c++

I've got a homework assignment that I have no idea how to even start. The instructions are to have an input .txt file with text that contains some mistakes. I have to fix that in the output .txt file, meaning, only 1 space between words, no space before a comma/punctuation and exactly 1 space after those. Capital letters at the beginning of a sentence. It also says that I don't have to use the ASCII table, because of the fact the capital letters are coded before lower case letters?
Input text example:
jaMEs , mY neIgHBor , Is A dOcTor . he SPoke eaSIlY , CLEarly And eloQuENtly.
Output:
James, my neighbor, is a doctor. He spoke easily, clearly and eloquently.
All we did in class was go over ifstream/ofstream and inputting/changing data in a .txt file, so I have no idea where to even begin. Is there a way to solve it so that it fixes any incorrect input text, or do I have to manually change every mistake in this particular text? No need to solve it for me. An example or some tips to get me started would be greatly appreciated!

Break the problem into pieces. First, read in the data from a file. Store it however you want, probably a string, then move on to the next part. Check each character and see if it is correct. If it is, move on. If not, make it correct and then move on. When you hit the end of the input, you are done.
To check whether a character is correct, decide what it should be at that position: upper case at the start of a sentence, lower case elsewhere, and a single space only where the rules allow one. If the character is not what it should be, fix it; otherwise move on.
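As a rough sketch of the "read, check each character, fix, write" approach described above: the names fixText and fixFile are made up for illustration, and treating '!' and '?' as sentence ends is an assumption that goes beyond the assignment text.

```cpp
#include <cctype>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: normalizes spacing and capitalization of `in`.
std::string fixText(const std::string& in)
{
    std::string out;
    bool startOfSentence = true;  // the next letter must be upper case
    bool pendingSpace = false;    // a single space is owed before the next word

    for (unsigned char c : in) {
        if (std::isspace(c)) {
            // Collapse runs of whitespace; never start the output with a space.
            pendingSpace = !out.empty();
        } else if (c == ',' || c == '.' || c == '!' || c == '?') {
            out += static_cast<char>(c);  // no space before punctuation
            pendingSpace = true;          // exactly one space after it
            if (c != ',') startOfSentence = true;
        } else {
            if (pendingSpace) out += ' ';
            pendingSpace = false;
            if (std::isalpha(c)) {
                c = static_cast<unsigned char>(startOfSentence ? std::toupper(c)
                                                               : std::tolower(c));
            }
            startOfSentence = false;
            out += static_cast<char>(c);
        }
    }
    return out;
}

// Hypothetical wrapper: reads inPath, writes the fixed text to outPath.
// Returns false if either file could not be opened or written.
bool fixFile(const std::string& inPath, const std::string& outPath)
{
    std::ifstream in(inPath);
    std::ofstream out(outPath);
    if (!in || !out) return false;
    std::string text((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    out << fixText(text);
    return static_cast<bool>(out);
}
```

Calling fixFile("input.txt", "output.txt") (file names assumed) would apply the same rules to any input text, so nothing has to be corrected by hand.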

Inspect each character as you read it. If it's a full-stop, then remember to upcase the next alphanumeric, otherwise to downcase it. If it's a space, then just remember that you've seen a space - don't print it until you see a word character.
Something like:
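#include <algorithm>
#include <cctype>
#include <iostream>
#include <iterator>

int main()
{
    // Conversion to apply to the next alphanumeric character.
    int (*t)(int) = std::toupper;
    // Pending separator: printed only just before the next word character.
    char const* last = "";
    std::for_each(std::istreambuf_iterator<char>{std::cin},
                  std::istreambuf_iterator<char>{},
                  [&](unsigned char c) {  // unsigned char keeps the <cctype> calls well-defined
                      if (std::isspace(c)) {
                          last = " ";
                      } else if (std::isalnum(c)) {
                          std::cout << last << static_cast<char>(t(c));
                          last = "";
                          t = std::tolower;
                      } else if (c == ',') {
                          std::cout << c;
                          last = " ";
                      } else if (c == '.') {
                          std::cout << c;
                          last = " ";
                          t = std::toupper;
                      }
                  });
}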

Related

FLUTTER - Checking if a string contains another one

I am working on an English vocabulary learning app. Some of the exercises given to the users are written quizzes. They have to translate French words into English words and vice versa.
To make the checking a little more sophisticated than just "1" or "0" (TypedWord == expectedWord), I have been working with similarities between strings and that worked well (for spelling mistakes for example).
I had also used the contains function, so that for example, if the user adds an article in front of the expected word, it doesn't consider it wrong. (Ex : Ecole (School is expected), but user writes "A school").
So I was checking with lines such as "if (typedWord.contains(word)==true) then...". It works fine for the article problem.
But it raises another issue:
Ex: A bough --> the expected French word is "branche". If the user types "une branche", it considers it correct, which is great. But if the user types "débrancher" (to unplug), it considers it correct as well, because the word "branche" is a part of "débrancher"...
How could I keep this from happening? Any ideas for other ways to go about it?
I read the three proposed answers which are really interesting. The thing is that some of the words are compound.... "Ex : kitchen appliance, garden tool" etc... so then I think the "space" functions might be problematic...
In this case, split the whole answer on spaces, then compare each piece with the correct word.
For example:
User's answer: That is my school
Split it on spaces and you get an array of words:
that, is, my, school.
Then compare each word with your word. A match is found only when a whole word is equal to the expected one.
The Flutter code would look like this:
usersAnswer?.split(" ").forEach((word){
if(word == correctAnswer)
print("this is a correct answer");
});
You can split the string by space and check if the resulting array has the word you're looking for.
typedWord.split(' ').contains('branche');
So if typedWord is 'une branche', the split(' ') will turn it into this array: ['une', 'branche'].
Now when you check whether this array contains('branche'), it looks for an element equal to the exact string branche, which in this case exists, so it returns true.
However, if it's 'une debranche', the resulting array would be ['une', 'debranche'], and because this array has no element equal to 'branche', it returns false. Remember that split turns the string into an array, and contains on an array checks whether an item with exactly the value you provide exists, whereas contains on a string checks whether any part of that string matches the given value.
You could check for whitespaces before and after the correct word: something like if (typedWord.contains(' '+word+' ')==true) then..., so that "débrancher" gets marked as wrong. This is kind of strict, though: if the sentence must be completed with some punctuation, it would be rejected by this check. You'll probably want some RegExp that allows punctuation but not whitespaces.

How to parse user input while ignoring noise words in C++?

I'm trying to write a simple text adventure game in C++. I want to allow the user to be able to type in phrases such as "GET THE DOG" where the code would ignore 'THE' and just give me the important things like 'GET' and 'DOG'. I also want the game to support movement, so another example of a phrase could be something like "MOVE TO THE LEFT" where the game would ignore 'TO' and 'THE' and only pay attention to 'MOVE' 'LEFT'.
Anyone have any tips on how to write a function to do this? I thought at first I could use getline, but the only way I think I can get that to work, is if I already know the position of the important words. My friend suggested using substr to put the strings into a vector, then iterating over that. But even that way I'm not too sure how I'd use substr to do such a thing.
Thanks!
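#include <cstring>
#include <iostream>
using namespace std;

int main()
{
    char str[100];
    cin.getline(str, 100);
    // strtok returns a pointer to the first space-delimited token.
    char* point = strtok(str, " ");
    while (point != NULL) {
        cout << point << endl;
        // Passing NULL continues scanning the same string.
        point = strtok(NULL, " ");
    }
}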
Here is something I've come up with while trying to figure out how to do this. I'm not really sure why it works, but it's doing something right. It's pointing to whole words, because whenever I print the pointer, it prints the word before the whitespace.
The usual approach would be to split the input up into words (probably in a std::vector<std::string>), and filter (std::remove_if) the words using a set (probably a std::unordered_set<std::string>) of "stop words". Then you can try to make sense of what's left.
Technically, a stop word is a word so common that it is pointless to use it in a search. I don't know why they are called "stop words", but it is definitely the usual term and you can use it to find some common lists. Not all of them are "noise", in your sense, but I think all your noise words will be on common stop word lists.
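A minimal sketch of that split-and-filter idea follows. The function name parseCommand and the tiny stop-word list are assumptions for illustration; upper-casing the words so that "get" and "GET" compare equal is also an added assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: splits `line` on whitespace, upper-cases each word,
// and drops the stop words.
std::vector<std::string> parseCommand(const std::string& line)
{
    // Illustrative list only; a real game would likely need more entries.
    static const std::unordered_set<std::string> stopWords = {"THE", "TO", "A", "AN", "AT"};

    std::vector<std::string> words;
    std::istringstream iss(line);
    std::string word;
    while (iss >> word) {  // operator>> skips any amount of whitespace
        std::transform(word.begin(), word.end(), word.begin(),
                       [](unsigned char ch) { return static_cast<char>(std::toupper(ch)); });
        words.push_back(word);
    }

    // Erase-remove idiom: remove_if moves the kept words forward, erase trims the tail.
    words.erase(std::remove_if(words.begin(), words.end(),
                               [](const std::string& w) { return stopWords.count(w) != 0; }),
                words.end());
    return words;
}
```

With this, parseCommand("MOVE TO THE LEFT") leaves only MOVE and LEFT, so the game logic can treat the first remaining word as the verb and the rest as its arguments.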

Strange error printing getline() strings in cout

I was trying to test my classes when I encountered a weird problem in the input of test cases.
I tried to simplify the input to see what went wrong so I created the program below.
#include <iostream>
#include <string>
int main()
{
std::string number;
while (std::getline(std::cin, number))
{
std::cout << std::string(number) << " ";
}
}
Basically, I am getting each line of text and storing it in a string variable using getline(). Then I display each string using std::cout and append a single space character.
My input file contains this:
one six
one seven
The expected output should be like this:
one six one seven
But instead, I get this:
one seven
That is a space character followed by the second line of the input. It disregards the first line of input. I know for a fact that each line is being read properly because they were correctly displayed when I replaced the code with this:
std::cout << std::string(number) << std::endl;
This error is quite new to me. What's happening here? Can anybody explain? TIA!
OK, it's clear.
Your input file must be: one six\r\none seven\r\n with normal Windows EOLs.
When you read it under Cygwin, the first read gives you one six\r, because only the \n is eaten by getline, and likewise one seven\r for the second line.
So you write: one six\r one seven\r (with a trailing blank). But a \r on its own puts the cursor back in the first column of the same line, so the second line overwrites the first.
Normally the problem is not visible if you replace the trailing blank with std::endl, which puts the cursor on a new line. The tab (\t) is really a special case: it puts the cursor on the eighth column, exactly where you expect it, but by pure chance. If you swapped the two lines it would be more apparent, because you would see the remainder of the first line at the end of the second.
You can confirm it by writing the output to a file and opening it in an editor.
I could reproduce it under Linux with a file that uses Windows EOLs. The reason is that Cygwin closely mimics Unix/Linux and uses the Unix EOL convention of only \n.

C++: File input stream and skipping to next line?

I'm trying to write a program that takes input data from two files for a season of some sport (e.g., football) and writes an output listing rankings each week. In the input file with the scores for each game, every week is separated by a line of '-' characters. I have an if/else inside a loop where the program peeks at the first character of each line. If it sees a character other than '-', it reads normally. However, when it reads '-', the program will begin the output cycle.
The thing is, being that this is peek, I need to figure out how to get to the next line without creating new input and not cause a crash. All I can think of is using inStream.find( !'-' ); or inStream.seekg( !'-' );. Are there any other options I can use?
Also, for reference, the code is listed here: https://coderpad.io/475356. Look for line 80 for the problem area. Just don't make any edits please.
Thank you for your time.
P.S.: If anyone can find any other crashes, though, feel free to mention it.
How about just using ignore() to skip the line?
inStream.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
Make sure you have:
#include <limits>
If you prefer not to have std::, just put using std::numeric_limits; at the top of your file, and then drop std:: from the expression above.

How to read a message from a file, modifying only words?

Suppose I have the following text:
My name is myName. I love
stackoverflow .
Hi, Guys! There is more than one space after "Guys!" 123
And also after "123" there are 2 spaces and newline.
Now I need to read this text file exactly as it is and perform some actions only on the alphanumeric words. After that I have to print it with the changed words, but with spaces, newlines and punctuation unchanged and in the same positions. When an alphanumeric word is changed, its length remains the same. I have tried this with a library, checking for alphanumeric values, but the code gets very messy. Is there any other way?
You can read your file line by line with the fgets() function. It fills a char array that you can then work with, e.g. iterate over it, split it into alnum words, change the words, and then write the fixed string into a new file with the fwrite() function.
If you prefer C++ way of working with files (iostream), you can use istream::getline. It will save spaces; but it will consume "\n". If you need to save even "\n" (it can be '\r' and '\r\n' sometimes), you can use istream::get.
Maybe you should look at Boost Tokenizer. It can break a string into a series of tokens and iterate through them. The following sample breaks up a phrase into words:
#include <boost/tokenizer.hpp>
#include <iostream>
#include <string>

int main()
{
    std::string s = "Hi, Guys! There is more...";
    // The default TokenizerFunc splits on whitespace and punctuation.
    boost::tokenizer<> tok(s);
    for (boost::tokenizer<>::iterator beg = tok.begin(); beg != tok.end(); ++beg)
    {
        std::cout << *beg << "\n";
    }
    return 0;
}
But in your case you need to provide a TokenizerFunc that will break up a string at alphanumeric/non-alphanumeric boundaries.
For more information see Boost Tokenizer documentation and implementation of an already provided char_separator, offset_separator and escaped_list_separator.
Code usually gets messy because the problem was not broken down into clear functions and classes. If you do break it down, you will have a few functions that each do precisely one thing (not messy). Your main function will then just call these simple functions. If the function names are well chosen, the main function will become short and clear, too.
In this case, your main function needs to do:
Loop: Read every line of a file
On every line, check if and where a "special" word occurs.
If a special word occurs, replace it
Extra hints: a line of text can be stored as a std::string and can be read by std::getline(std::cin, line)
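One way to keep this tidy is to put the word handling into a single function, sketched below. The names transformWords and reverseWord are invented for this example, and reversing a word is only a stand-in, since the question does not say which change is required.

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: applies `change` to every maximal run of alphanumeric
// characters in `text` and leaves all other characters where they are.
std::string transformWords(std::string text,
                           const std::function<std::string(const std::string&)>& change)
{
    auto isWordChar = [](char ch) {
        return std::isalnum(static_cast<unsigned char>(ch)) != 0;
    };
    std::size_t i = 0;
    while (i < text.size()) {
        if (!isWordChar(text[i])) {
            ++i;  // spaces, newlines and punctuation are kept untouched
            continue;
        }
        std::size_t j = i;
        while (j < text.size() && isWordChar(text[j])) ++j;  // find the end of the word
        const std::string replacement = change(text.substr(i, j - i));
        text.replace(i, j - i, replacement);
        i += replacement.size();
    }
    return text;
}

// Stand-in transformation (an assumption): reverse the word, keeping its length.
std::string reverseWord(const std::string& w)
{
    return std::string(w.rbegin(), w.rend());
}
```

The function works on a line from std::getline just as well as on a whole file read into one std::string, which would also keep the newlines in place.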