Why is this char stopping my program? - c++

Does the newline character have some kind of special significance in c++? Is it a non-ASCII character?
I'm trying to build a Markov chain for each unique n-character substring within a larger piece of text. Every time I come across a new unique substring I enter it into a map whose value is a 256-element vector (one element for each character in the extended ASCII table).
There's no problem when I print out the entire contents of the file ("lines" is a vector of lines of text built using ifstream and getline):
for(int i=0; i<lines.size(); i++) cout << lines[i] << endl;
The whole text file shows up in the console. The problem happens when I try to return the newline character from a function that returns a char. "MOVESPACES" is an integer constant that determines how many characters further ahead to move in the vector of strings on each iteration.
char GetNextChar(int row, int col){
    for (int i=0; i<MOVESPACES; i++) {
        if (col+1<lines[row].size()) { // If you're not at the end of the line, keep going
            col+=1;
        } else { // Otherwise, move to the beginning of the next row
            row+=1;
            col=0;
        }
    }
    return lines[row].at(col);
}
I've walked through with the debugger, and when it gets to the 1st column of the 2nd line it craps out on me – no error or anything. It fails within this function, not the calling function.
The file I'm using is A Christmas Carol (first thing that came up on Project Gutenberg). For reference here are the first few lines:
STAVE I: MARLEY'S GHOST
MARLEY was dead: to begin with. There is no doubt
whatever about that. The register of his burial was
The function breaks when it should return the first character on the second line. This doesn't happen if I get rid of the newline, or if I build the "lines" vector myself line by line in the program. Any idea what's wrong?

Your GetNextChar function is assuming that if you are at the last character in some line, there will be a character in the next line. What happens if there is no character in that next line? This can happen in two places: When you have hit end of file, or when the next line is the empty string.
The second line is the empty string.
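A minimal sketch of one way to guard against both cases, assuming C++11. The name advance and its parameters are hypothetical, not from the question: it moves a (row, col) position forward, skips empty lines, and reports failure instead of indexing past the end. Note that getline strips the newline, so the strings in "lines" never contain '\n'.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: advances (row, col) by `steps` characters through `lines`,
// skipping empty lines. Returns false if the end of the text is reached first,
// in which case (row, col) must not be used for indexing.
bool advance(const std::vector<std::string>& lines,
             std::size_t& row, std::size_t& col, int steps) {
    for (int i = 0; i < steps; ++i) {
        if (row >= lines.size()) return false;
        if (col + 1 < lines[row].size()) {
            ++col;  // still inside the current line
        } else {
            ++row;  // move to the next line that actually has characters
            while (row < lines.size() && lines[row].empty()) ++row;
            if (row == lines.size()) return false;  // ran off the end of the text
            col = 0;
        }
    }
    return row < lines.size() && col < lines[row].size();
}
```

The caller would check the return value before reading lines[row][col], rather than letting at() throw on an empty line.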

Related

Formatting Output c++

Wanting to do some fancy formatting. I have several lines that I want to interact with each other. Get the first two lines. Print out the character in the second line times the integer in the first line. Separate them all with an asterisk character. No asterisk after the final character is printed. Move on to the next integer and character. Print them on a separate line. Do this for the whole list. The problem I am having is printing them on separate lines. Example:
5
!
2
?
3
#
Desired output:
!*!*!*!*!
?*?
#*#*#
My output:
!*!*!*!*!*?*?*#*#*#*
Below is my code. Another thing to mention is that I am reading the data about the characters and numbers from a separate text file. So I am using the getline function.
Here is a chunk of the code:
ifstream File;
File.open("NumbersAndCharacters.txt");
string Number;
string Character;
while(!File.eof()){
    getline(File, Number);
    getline(File, Character);
    //a few lines of stringstream action
    for (int i=0; i<=Number; i++){
        cout<<Character<<"*";
    }//end for. I think this is where
     //the problem is.
}//end while
File.close();
return 0;
Where is the error? Is it the loop? Or do I not understand getline?
It should be printing an "endl" or "\n" after each multiplication of the character is done.
Thanks to everyone for the responses!
You have not shown your code yet, but what seems to be the issue here is that you simply forgot to add a new line every time you print your characters. For example, you probably have done:
std::cout << "!";
Well, in this context you forgot to add the new line ('\n'), so you have two options here. First, insert the new line yourself:
std::cout << "!\n";
Or use std::endl:
std::cout << "!" << std::endl;
For comparison of the two, see here and here. Without further description, or more importantly your code that doesn't seem to work properly, we can't make suggestions or solve your problem.
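As an illustration of the desired output format, here is a small sketch. The function name repeatJoined and its parameters are hypothetical: it builds one output row by placing the separator only between characters, so no trailing asterisk appears, and the caller ends each row with a newline.

```cpp
#include <string>

// Hypothetical helper: returns `c` repeated `n` times with `sep` between
// consecutive copies, e.g. ('!', 3) gives "!*!*!". No trailing separator.
std::string repeatJoined(char c, int n, char sep = '*') {
    std::string out;
    for (int i = 0; i < n; ++i) {
        if (i > 0) out += sep;  // separator goes before every copy except the first
        out += c;
    }
    return out;
}
```

Inside the read loop, each pair would then be printed with something like std::cout << repeatJoined(character, count) << '\n'; so every row ends with a newline.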

C++ cout char 'return' character from file appears twice

I'm trying to create a program that encrypts files based on how Nazi Germany's Enigma machine worked, but without the flaw :P.
I have a function that gets a character at n point in a file, but when it returns a return character and I cout << it, it's like it hit enter twice.
IE if I loop cout-ing from i++ points in a file the individual lines in the terminal appear separated
by more returns
than one.
Here's the function:
char charN(string pathOf, int pointIn){
    char r = NULL;
    // NULL so I can tell when it doesn't return a character.
    int sizeOf; //to store the found size of the file.
    ifstream cf; //to store the Character Found.
    ifstream siz; //used later to get the size of the file
    siz.open(pathOf.c_str());
    siz.seekg(0, std::ios::end);
    sizeOf = siz.tellg(); // these get the length of the file and put it in sizeOf.
    cf.open(pathOf.c_str());
    if(cf.is_open() && pointIn < sizeOf){ //if not open, or if the character to get is farther out than the size of the file, let the function return the error condition: 'NULL'.
        cf.seekg(pointIn); // move to the point in the file where the character should be, get it, and get out.
        cf.get(r);
        cf.close();
    }
    return r;
}
It works correctly if I use cout << '\n', but what's different about returns from a file and '\n'?
Or is there something else I'm missing?
I've been googling about but I can't find anything remotely similar to my problem, thanks in advance.
I'm using Code::Blocks 13.12 as my compiler if that matters.
Is this on a Windows machine? On Windows, new lines in text files are represented by \r\n.
\r = carriage return
\n = line feed
It's possible that you are printing each one separately with cout and that the output is starting a new line for each one.
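If that is the cause, one possible workaround is to treat a \r\n pair as a single line break before printing. The sketch below assumes the raw bytes have already been read into a string; the function name normalizeNewlines is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns a copy of `raw` in which every "\r\n" pair is
// replaced by a single '\n'. A '\r' that is not followed by '\n' is kept.
std::string normalizeNewlines(const std::string& raw) {
    std::string out;
    out.reserve(raw.size());
    for (std::size_t i = 0; i < raw.size(); ++i) {
        const bool crBeforeLf =
            raw[i] == '\r' && i + 1 < raw.size() && raw[i + 1] == '\n';
        if (crBeforeLf) continue;  // drop the carriage return of a CRLF pair
        out += raw[i];
    }
    return out;
}
```

When reading character by character instead, the equivalent would be to skip any '\r' that is immediately followed by '\n'.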

seekg() not working as expected

I have a small program, that is meant to copy a small phrase from a file, but it appears that I am either misinformed as to how seekg() works, or there is a problem in my code preventing the function from working as expected.
The text file contains:
//Intro
previouslyNoted=false
The code is meant to copy the word "false" into a string
std::fstream stats("text.txt", std::ios::out | std::ios::in);
//String that will hold the contents of the file
std::string statsStr = "";
//Integer to hold the index of the phrase we want to extract
int index = 0;
//COPY CONTENTS OF FILE TO STRING
while (!stats.eof())
{
    static std::string tempString;
    stats >> tempString;
    statsStr += tempString + " ";
}
//FIND AND COPY PHRASE
index = statsStr.find("previouslyNoted="); //index is equal to 8
//Place the get pointer where "false" is expected to be
stats.seekg(index + strlen("previouslyNoted=")); //get pointer is placed at 24th index
//Copy phrase
stats >> previouslyNotedStr;
//Output phrase
std::cout << previouslyNotedStr << std::endl;
But for whatever reason, the program outputs:
=false
What I expected to happen:
I believe that I placed the get pointer at the 24th index of the file, which is where the phrase "false" begins. Then the program would've inputted from that index onward until a space character would have been met, or the end of the file would have been met.
What actually happened:
For whatever reason, the get pointer started an index before expected. And I'm not sure as to why. An explanation as to what is going wrong/what I'm doing wrong would be much appreciated.
Also, I do understand that I could simply make previouslyNotedStr a substring of statsStr, starting from where I wish, and I've already tried that with success. I'm really just experimenting here.
The VisualC++ tag means you are on Windows. On Windows the end of line takes two characters (\r\n). When you read the file one string at a time, this end-of-line sequence is treated as a delimiter, and you replace it with a single space character.
Therefore, after you read the file, your statsStr does not match the contents of the file. Everywhere there is a new line in the file, you have replaced two characters with one. Hence when you use seekg to position yourself in the file based on numbers you got from the statsStr string, you end up in the wrong place.
Even if you get the new line handling correct, you will still encounter problems if the file contains two or more consecutive white space characters, because these will be collapsed into a single space character by your read loop.
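The off-by-one can be reproduced without a file. The sketch below assumes the file's bytes are "//Intro\r\npreviouslyNoted=false" (CRLF line ending, as described above); the helper name joinWords is hypothetical and mirrors the question's read loop.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: rebuilds text the way the question's loop does, as
// whitespace-separated words joined by single spaces.
std::string joinWords(const std::string& raw) {
    std::istringstream in(raw);
    std::string word, joined;
    while (in >> word) joined += word + " ";
    return joined;
}

// Assumed raw file contents, with a two-character CRLF line ending.
const std::string kRaw = "//Intro\r\npreviouslyNoted=false";
const std::string kKey = "previouslyNoted=";

// Offset just past the key, computed from the joined string (as the question does).
std::string::size_type offsetFromJoined() {
    return joinWords(kRaw).find(kKey) + kKey.size();
}

// Offset just past the key, computed from the raw bytes.
std::string::size_type offsetFromRaw() {
    return kRaw.find(kKey) + kKey.size();
}
```

Here offsetFromJoined() is one less than offsetFromRaw(), and kRaw.substr(offsetFromJoined()) starts at the '=' character, which matches the "=false" output reported in the question.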
You are reading the file word by word. There are better methods:
while (getline(stats, statsStr))
{
    // An entire line is read into statsStr.
    std::string::size_type posn = statsStr.find("previouslyNoted=");
    // ...
}
By reading entire text lines into a string, there is no need to reposition the file.
Also, there is a white-space issue when reading by word. This will affect where you think the text is in the file. For example, white space is skipped, and there is no telling how many spaces, newlines or tabs were skipped.
By the way, don't even think about replacing the text in the same file. Replacement of text only works if the replacement text has the same length as the original text in the file. Write to a new file instead.
Edit 1:
A better method is to declare your key strings as arrays. This helps with positioning within a string:
static const char key_text[] = "previouslyNoted=";
while (getline(stats, statsStr))
{
    std::string::size_type key_position = statsStr.find(key_text);
    std::string::size_type value_position = key_position + sizeof(key_text) - 1; // - 1 excludes the nul terminator.
    // value_position points to the character after the '='.
    // ...
}
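Putting the pieces together, a complete lookup might look like the sketch below, assuming C++11. The function name findValue and its signature are hypothetical; it also checks for npos, which the fragments above leave out, and drops a trailing '\r' in case the stream delivers CRLF line endings unchanged.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: scans `in` line by line for `key` and stores the rest
// of that line in `value`. Returns false if no line contains the key.
bool findValue(std::istream& in, const std::string& key, std::string& value) {
    std::string line;
    while (std::getline(in, line)) {
        // Drop a trailing '\r' left over from a CRLF line ending.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        const std::string::size_type pos = line.find(key);
        if (pos != std::string::npos) {
            value = line.substr(pos + key.size());  // text after the key
            return true;
        }
    }
    return false;
}
```

Because it takes a std::istream&, the same function works with an opened std::fstream or with a std::istringstream for testing.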
You may want to save programming time by making your data file conform to an existing format, such as INI or XML, and using appropriate libraries to parse it.

Count the number of unique words and occurrence of each word

CSCI-15 Assignment #2, String processing. (60 points) Due 9/23/13
You MAY NOT use C++ string objects for anything in this program.
Write a C++ program that reads lines of text from a file using the ifstream getline() method, tokenizes the lines into words ("tokens") using strtok(), and keeps statistics on the data in the file. Your input and output file names will be supplied to your program on the command line, which you will access using argc and argv[].
You need to count the total number of words, the number of unique words, the count of each individual word, and the number of lines. Also, remember and print the longest and shortest words in the file. If there is a tie for longest or shortest word, you may resolve the tie in any consistent manner (e.g., use either the first one or the last one found, but use the same method for both longest and shortest). You may assume the lines comprise words (contiguous lower-case letters [a-z]) separated by spaces, terminated with a period. You may ignore the possibility of other punctuation marks, including possessives or contractions, like in "Jim's house". Lines before the last one in the file will have a newline ('\n') after the period. In your data files, omit the '\n' on the last line. You may assume that the lines will be no longer than 100 characters, the individual words will be no longer than 15 letters and there will be no more than 100 unique words in the file.
Read the lines from the input file, and echo-print them to the output file. After reaching end-of-file on the input file (or reading a line of length zero, which you should treat as the end of the input data), print the words with their occurrence counts, one word/count pair per line, and the collected statistics to the output file. You will also need to create other test files of your own. Also, your program must work correctly with an EMPTY input file – which has NO statistics.
Test file looks like this (exactly 4 lines, with NO NEWLINE on the last line):
the quick brown fox jumps over the lazy dog.
now is the time for all good men to come to the aid of their party.
all i want for christmas is my two front teeth.
the quick brown fox jumps over a lazy dog.
Copy and paste this into a small file for one of your tests.
Hints:
Use a 2-dimensional array of char, 100 rows by 16 columns (why not 15?), to hold the unique words, and a 1-dimensional array of ints with 100 elements to hold the associated counts. For each word, scan through the occupied lines in the array for a match (use strcmp()), and if you find a match, increment the associated count, otherwise (you got past the last word), add the word to the table and set its count to 1.
The separate longest word and the shortest word need to be saved off in their own C-strings. (Why can't you just keep a pointer to them in the tokenized data?)
Remember – put NO NEWLINE at the end of the last line, or your test for end-of-file might not work correctly. (This may cause the program to read a zero-length line before seeing end-of-file.)
This is not a long program – no more than about 2 pages of code
Here is what I have so far:
#include<iostream>
#include<iomanip>
#include<fstream>
#include<string>
#include<cstring>
using namespace std;
void totalwordCount(ifstream &inputFile)
{
    char words[100][16]; // Holds the unique words.
    char *token;
    int totalCount = 0; // Counts the total number of words.
    // Read every word in the file.
    while(inputFile >> words[99])
    {
        totalCount++; // Increment the total number of words.
        // Tokenize each word and remove spaces, periods, and newlines.
        token = strtok(words[99], " .\n");
        while(token != NULL)
        {
            token = strtok(NULL, " .\n");
        }
    }
    cout << "Total number of words in file: " << totalCount << endl;
}
void uniquewordCount(ifstream &inputFile)
{
    char words[100][16]; // Holds the unique words
    int counter[100];
    char *tok = "0";
    int uniqueCount = 0; // Counts the total number of unique words
    while(!inputFile.eof())
    {
        uniqueCount++;
        tok = strtok(words[99], " .\n");
        while(tok != NULL)
        {
            tok = strtok(NULL, " .\n");
            inputFile >> words[99];
            if(strcmp(tok, words[99]) == 0)
            {
                counter[99]++;
            }
            else
            {
                words[99][15] += 1;
            }
            uniqueCount++;
        }
    }
    cout << counter[99] << endl;
}
int main(int argc, char *argv[])
{
    ifstream inputFile;
    char inFile[12] = "string1.txt";
    char outFile[16] = "word result.txt";
    // Get the name of the file from the user.
    cout << "Enter the name of the file: ";
    cin >> inFile;
    // Open the input file.
    inputFile.open(inFile);
    // If successfully opened, process the data.
    if(inputFile)
    {
        while(!inputFile.eof())
        {
            totalwordCount(inputFile);
            uniquewordCount(inputFile);
        }
    }
    return 0;
}
I already took care of how to count the total number of words in the file in the totalwordCount() function, but in the uniquewordCount() function, I am having trouble counting the total number of unique words and counting the number of occurrences of each word. Is there anything that I need to change in the uniquewordCount() function?
This program contains several issues which are to be considered harmful! To prevent bad software being created based on entirely nonsensical assignments like the above, here are a number of hints:
Always test the stream for success after reading from it. Using in.eof() to determine if the stream is in a good state does not work! One of the problems is that you will get an infinite loop if the stream goes bad for a different reason than end of file, e.g., failure to correctly parse a value (this will set std::ios_base::failbit but not std::ios_base::eofbit).
Reading to a fixed size char array a using in >> a without having set up limits for the number of characters to be read is the C++ way to spell gets()! If you really think that using in >> a is the right way to go (see next item), you absolutely need to set up the array's width, e.g., using in >> std::setw(sizeof(a)) >> a. You still need to check that this extraction was successful, of course.
From the looks of it, your teacher wants you to actually use std::istream::getline() to read the array, e.g., using in.getline(a, sizeof(a)) (which, of course, needs to be checked for success).
Note that the formatted input, i.e., in >> a already tokenizes the stream being received by spaces! There is no need to faff about with strtok() after that.
Once you have consumed a stream, it is consumed. Assuming the characters don't come from a file but rather from something like standard input, you also can't rewind the stream to read it again. I'd think you want to tokenize the values once and use them for both purposes.
This is more of a sidenote: after you created a stream, its nature should be entirely immaterial for the processing of the stream's content (although, e.g., for string streams you might want to eventually collect the result using the str() member): implement your stream processing functions in terms of std::istream rather than std::ifstream!
Since you have a concrete question ("Is there anything that I need to change in the uniquewordCount() function?"): yes, everything! Throw away this function entirely and rethink what you need to do. Basically, the structure of the functionality should be along the lines of
char buffer[100];
while (in.getline(buffer, sizeof(buffer))) {
    // tokenize buffer into words
    // for each word check if it already exists
    // if the word does not exist, append it to the array of known words and set count to 1
    // if the word exists, increment the count
    // determine if the word is shorter or longer than the shortest or longest word so far
    // if it is the case, remember the word's index or a pointer to it
}
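The outline above could be filled in roughly as follows. This is a sketch under stated assumptions, not a reference solution: the struct WordStats, the function collectStats, and the constant names are hypothetical, the limits come from the assignment text, the '\r' delimiter is an added guard for CRLF files, and tracking of the longest and shortest words is left out for brevity.

```cpp
#include <cstring>
#include <istream>
#include <sstream>

// Limits taken from the assignment text.
const int kMaxWords = 100;    // at most 100 unique words
const int kMaxWordLen = 15;   // words are at most 15 letters
const int kMaxLineLen = 100;  // lines are at most 100 characters

// Hypothetical container for the collected statistics.
struct WordStats {
    char words[kMaxWords][kMaxWordLen + 1];  // +1 for the terminating '\0'
    int counts[kMaxWords];
    int uniqueCount;
    int totalCount;
    int lineCount;
};

// Reads lines from `in` until end of file or an empty line, tokenizes each
// line with strtok(), and fills `stats`. No std::string objects are used.
void collectStats(std::istream& in, WordStats& stats) {
    stats.uniqueCount = stats.totalCount = stats.lineCount = 0;
    char buffer[kMaxLineLen + 2];   // a full line, a possible '\r', and '\0'
    const char* delims = " .\r";    // '\r' is an assumption for CRLF files
    while (in.getline(buffer, sizeof(buffer)) && buffer[0] != '\0') {
        ++stats.lineCount;
        for (char* tok = std::strtok(buffer, delims); tok != nullptr;
             tok = std::strtok(nullptr, delims)) {
            ++stats.totalCount;
            // Linear search through the words seen so far.
            int i = 0;
            while (i < stats.uniqueCount && std::strcmp(stats.words[i], tok) != 0) ++i;
            if (i < stats.uniqueCount) {
                ++stats.counts[i];  // known word: bump its count
            } else if (stats.uniqueCount < kMaxWords) {
                // New word: copy it into the table and start its count at 1.
                std::strncpy(stats.words[i], tok, kMaxWordLen);
                stats.words[i][kMaxWordLen] = '\0';
                stats.counts[i] = 1;
                ++stats.uniqueCount;
            }
        }
    }
}
```

Since the function takes a std::istream&, it can be fed an std::ifstream opened from argv or a std::istringstream in a test, and an empty input simply leaves all counters at zero.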

Using cin.get() to grab a line of text, then using it in a loop to display that line?

Ok, so I came across this code snippet in my textbook that's supposed to echo every other character a user types in. Now, I understand the every other character part, but I'm having difficulty with the use of cin.get(). I understand why the first cin.get() is there, but why is it also inside the loop? I'm guessing I'm not fully grasping the nature of input streams...
EDIT: It just clicked... I'm an idiot. Thanks for clearing that up.
char next;
int count = 0;
cout << "Enter a line of input:\n";
cin.get(next);
while (next != '\n')
{
    if ((count%2) == 0)
        cout << next;
    count++;
    cin.get(next);
}
Thanks in advance!
cin.get in this case does not "grab a line of text" as you seem to believe. cin.get in this case grabs just a single character. With cin.get you read characters the user is typing in, one by one, one after another. This is why you have cin.get in a loop.
The call to cin.get(next) that comes before the loop is only placing the first character of buffered user input into the variable 'next.'
Once inside the loop, and the character stored in 'next' has been processed (echoed if at an even index, otherwise ignored), cin.get(next) needs to be called again to retrieve the next character.
It's printing the characters at even positions.
char next;
int count = 0;
cout << "Enter a line of input:\n";
cin.get(next); //gets first character (position zero) from input stream
while (next != '\n') //checks that the character is not a line feed (end of the line)
{
    if ((count%2) == 0) //checks if position is even
        cout << next; //if yes, print that letter
    count++; //increments the count
    cin.get(next); //gets next character from input stream
}
We require two calls to cin.get(...):
before entering the while loop, to get the first character (position zero)
inside the while loop, to get each following character
But what is the use of the cin.get(ch) outside the loop, and how does cin.get() work inside the loop? The two calls look like they behave differently, which is confusing.
There is a statement in the loop that prints the character obtained with cin.get(next), but it does not print it right away: everything is printed together after the Enter key is pressed. I expected the characters to be displayed as they are typed on the keyboard, but it does not work like that.
istream& get(char &c) gets a character from the input stream.
So suppose that on the first call to cin.get(next) you typed:
"hello world!"
Then subsequent calls to cin.get(next) will fetch h, e, l, l, etc., one per call, until there are no more characters in the input stream; at that point the call blocks, waiting for the user to enter more input.
Streams in C++ are buffered. Think of the buffer as a line of letters. When you call cin.get(var), the first character in that line is removed and returned to you. That's how it works.
An example would help. When the first cin.get() is executed, let's say you type in:
LISP
Then cin.get() will place L in the variable, and the buffer will look like ISP. The next call will place I in the variable, and the buffer will look like SP, and so on.
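The same consumption behaviour can be observed without a keyboard by reading from a string stream. The sketch below is a variation on the textbook loop, not the textbook's code; the function name everyOtherChar is hypothetical, and it also stops at end of input rather than only at '\n'.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: consumes the first line of `in` one character at a
// time and returns the characters at even positions (0, 2, 4, ...).
std::string everyOtherChar(std::istream& in) {
    std::string out;
    char next;
    int count = 0;
    // get() returns the stream, which tests false once no character can be
    // extracted, so the loop also ends cleanly at end of input.
    while (in.get(next) && next != '\n') {
        if (count % 2 == 0) out += next;
        ++count;
    }
    return out;
}
```

With a std::istringstream holding "LISP" followed by a newline and more text, the function returns "LS", and the next get() on the same stream yields the first character after the newline, because everything before it has already been removed from the buffer.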