Detecting spaces in a file in C++

Hi, I am reading characters from a file and inserting them into a map, and I have that code working. How do I detect whether a character in the file is a space? I need to store the number of times a space occurs in the file. Any help would be great, thanks.
map<char, int> treeNodes; //character and the frequency
ifstream text("test.txt");
while(!text.eof())
{
text >> characters;
//getline(text,characters);
cout << characters;
if(treeNodes.count(characters) == 0)
{
if(isspace (characters))
{
cout << "space" << endl;
}
else
treeNodes.insert(pair<char,int>(characters,1));
}
else
{
treeNodes[characters] += 1;
}
}

Formatted input, i.e. reading with operator>>(), skips leading whitespace by default. You can turn this off using std::noskipws, but depending on what sort of things you want to read, that can be awkward to work with. The best approach is probably unformatted input, e.g. reading each line with std::getline() and splitting it on spaces within the program.
If you just want to count the number of times any particular character occurred, you probably want to use std::istreambuf_iterator<char> and just iterate over the content of the stream (this code also omits some other unnecessary clutter):
// note: 'end' without parentheses; 'end()' would declare a function
for (std::istreambuf_iterator<char> it(text), end; it != end; ++it) {
    ++treeNodes[*it];
}
BTW, the only sensible use of eof() is to determine, after a read has failed, whether it failed because the stream reached its end. Don't use it as a loop condition.

Couldn't you just cast the char to an int and test whether it is equal to the ASCII value of a space?

Related

How to read a complex input with istream&, string& and getline in c++?

I am very new to C++, so I apologize if this isn't a good question but I really need help in understanding how to use istream.
There is a project I have to create where it takes several amounts of input that can be on one line or multiple and then pass it to a vector (this is only part of the project and I would like to try the rest on my own), for example if I were to input this...
>> aaa bb
>> ccccc
>> ddd fff eeeee
Makes a vector of strings with "aaa", "bb", "ccccc", "ddd", "fff", "eeeee"
The input can be a char or string and the program stops asking for input when the return key is hit.
I know getline() gets a line of input and I could probably use a while loop to try and get the input such as...(correct me if I'm wrong)
while(!string.empty())
getline(cin, string);
However, I don't truly understand istream and it doesn't help that my class has not gone over pointers so I don't know how to use istream& or string& and pass it into a vector. On the project description, it said to NOT use stringstream but use functionality from getline(istream&, string&). Can anyone give somewhat of a detailed explanation as to how to make a function using getline(istream&, string&) and then how to use it in the main function?
Any little bit helps!
You're on the right track already; you'd only have to pre-fill the string with some dummy value to enter the while loop at all. More elegant:
std::string line;
do
{
std::getline(std::cin, line);
}
while(!line.empty());
This should already do the trick reading line by line (but possibly multiple words on one line!) and exiting, if the user enters an empty line (be aware that whitespace followed by newline won't be recognised as such!).
However, if anything on the stream goes wrong, you'll be trapped in an endless loop processing previous input again and again. So best check the stream state as well:
if(!std::getline(std::cin, line))
{
// this is some sample error handling - do whatever you consider appropriate...
std::cerr << "error reading from console" << std::endl;
return -1;
}
As there might be multiple words on a single line, you still have to split them. There are several ways to do so; quite an easy one is using a std::istringstream. You'll discover that it resembles what you are likely used to from std::cin:
std::istringstream s(line);
std::string word;
while(s >> word)
{
// append to vector...
}
Be aware that operator>> ignores leading whitespace and stops at the first trailing whitespace (or at the end of the stream, if reached), so you don't have to deal with it explicitly.
OK, you're not allowed to use std::stringstream (well, I used std::istringstream, but I suppose this little difference doesn't count, does it?). That changes things a little: it gets more complex, but on the other hand, we can decide ourselves what counts as a word and what as a separator... We might consider punctuation marks as separators just like whitespace, but allow digits to be part of words, so we'd accept e. g. ab.7c d as "ab", "7c", "d":
auto begin = line.begin();
auto end = begin;
while(end != line.end()) // iterate over each character
{
if(std::isalnum(static_cast<unsigned char>(*end)))
{
// we are inside a word; don't touch begin to remember where
// the word started
++end;
}
else
{
// non-alpha-numeric character!
if(end != begin)
{
// we discovered a word already
// (i. e. we did not move begin together with end)
words.emplace_back(begin, end);
// ('words' being your std::vector<std::string> to place the input into)
}
++end;
begin = end; // skip whatever we had already
}
}
// corner case: a line might end with a word NOT followed by whitespace
// this isn't covered within the loop, so we need to add another check:
if(end != begin)
{
words.emplace_back(begin, end);
}
It shouldn't be too difficult to adjust to different interpretations of what is a separator and what counts as word (e. g. std::isalpha(...) || *end == '_' to detect underscore as part of words, but digits not). There are quite a few helper functions you might find useful...
You could input the value of the first column, then call functions based on the value:
void Process_Value_1(std::istream& input, std::string& value);
void Process_Value_2(std::istream& input, std::string& value);
int main()
{
    // ...
    std::string first_value;
    while (input_file >> first_value)
    {
        if (first_value == "aaa")
        {
            Process_Value_1(input_file, first_value);
        }
        else if (first_value == "ccc") // comparison (==), not assignment (=)
        {
            Process_Value_2(input_file, first_value);
        }
        //...
    }
    return 0;
}
A sample function could be:
void Process_Value_1(std::istream& input, std::string& value)
{
    std::string b;
    input >> b;
    std::cout << value << "\t" << b << std::endl;
    input.ignore(1000, '\n'); // Ignore until newline.
}
There are other methods to perform the process, such as using tables of function pointers and std::map.

Simple C++ not reading EOF

I'm having a hard time understanding why while (cin.get(Ch)) doesn't see the EOF. I read in a text file with 3 words, and when I debug my WordCount is at 3 (just what I hoped for). Then it goes back to the while loop and gets stuck. Ch then has no value. I thought that after the newline it would read the EOF and break out. I am not allowed to use <fstream>, I have to use redirection in DOS. Thank you so much.
#include <iostream>
using namespace std;
int main()
{
char Ch = ' ';
int WordCount = 0;
int LetterCount = 0;
cout << "(Reading file...)" << endl;
while (cin.get(Ch))
{
if ((Ch == '\n') || (Ch == ' '))
{
++WordCount;
LetterCount = 0;
}
else
++LetterCount;
}
cout << "Number of words => " << WordCount << endl;
return 0;
}
while (cin >> Ch)
{ // we get in here if, and only if, the >> was successful
if ((Ch == '\n') || (Ch == ' '))
{
++WordCount;
LetterCount = 0;
}
else
++LetterCount;
}
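That's a common way to write such a loop with minimal changes. Note, however, that operator>> skips whitespace by default, so with cin >> Ch the tests for '\n' and ' ' never succeed unless std::noskipws is set on the stream; the original cin.get(Ch) does not have this problem.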
(Your code is unusual, trying to scan all characters and count whitespace and newlines. I'll give a more general answer to a slightly different question - how to read in all the words.)
The safest way to check whether a stream is still usable is if(stream). Beware of if(stream.good()) - it doesn't always work as expected and will sometimes quit too early. The last >> into a char will not take us to EOF, but the last >> into an int or string will take us to EOF. This inconsistency can be confusing. Therefore, it is not correct to use good(), or any other test that tests EOF.
string word;
while(cin >> word) {
++word_count;
}
There is an important difference between if(cin) and if(cin.good()). The former is the operator bool conversion. Usually, in this context, you want to test:
"did the last extraction operation succeed or fail?"
This is not the same as:
"are we now at EOF?"
After the last word has been read by cin >> word, the stream is at EOF. But the word is still valid and contains the last word.
TLDR: The eof bit is not important. The fail bit is: it tells us that the last extraction was a failure.
The Counting
The program counts newline and space characters as words. In your file contents "this is fun!" I see two spaces and no newline. This is consistent with the observed output indicating two words.
Have you tried looking at your file with a hex editor or something similar to be sure of the exact contents?
You could also change your program to count one more word if the last character read in the loop was a letter. This way you don't have to have newline terminated input files.
Loop Termination
I have no explanation for your loop termination issues. The while-condition looks fine to me. istream::get(char&) returns a stream reference. In a while-condition, depending on the C++ level your compiler implements, operator bool or operator void* will be applied to the reference to indicate if further reading is possible.
Idiom
The standard idiom for reading from a stream is
char c = 0;
while( cin >> c )
process(c);
I do not deviate from it without serious reason.
Your input file is
this is fun!{EOF}
The two spaces increase WordCount to 2, and then EOF ends the loop. If you add a newline, your input file is
this is fun!\n{EOF}
I took your program, loaded it into Visual Studio 2013, and changed cin to an fstream object that opened a file called stuff.txt containing the exact characters "This is fun!\n\r", and the program worked. As previous answers have indicated, be careful: if there's no \n at the end of the text, the program will miss the last word. However, I wasn't able to replicate the application hanging in an infinite loop. The code as written looks correct to me.
cin.get(char) returns a reference to an istream object, which then has its operator bool() called; that returns false when failbit or badbit is set. There are some better ways to write this code to deal with other error conditions... but this code works for me.
In your case, the correct way to bail out of the loop is:
while (cin.good()) {
char Ch = cin.get();
if (cin.good()) {
// do something with Ch
}
}
That said, there are probably better ways to do what you're trying to do.

Pull out data from a file and store it in strings in C++

I have a file which contains records of students in the following format.
Umar|Ejaz|12345|umar#umar.com
Majid|Hussain|12345|majid#majid.com
Ali|Akbar|12345|ali#geeks-inn.com
Mahtab|Maqsood|12345|mahtab#myself.com
Juanid|Asghar|12345|junaid#junaid.com
The data has been stored according to the following format:
firstName|lastName|contactNumber|email
The total number of lines(records) can not exceed the limit 100. In my program, I've defined the following string variables.
#define MAX_SIZE 100
// other code
string firstName[MAX_SIZE];
string lastName[MAX_SIZE];
string contactNumber[MAX_SIZE];
string email[MAX_SIZE];
Now, I want to pull data from the file, and using the delimiter '|', I want to put data in the corresponding strings. I'm using the following strategy to put back data into string variables.
ifstream readFromFile;
readFromFile.open("output.txt");
// other code
int x = 0;
string temp;
while(getline(readFromFile, temp)) {
int charPosition = 0;
while(temp[charPosition] != '|') {
firstName[x] += temp[charPosition];
charPosition++;
}
while(temp[charPosition] != '|') {
lastName[x] += temp[charPosition];
charPosition++;
}
while(temp[charPosition] != '|') {
contactNumber[x] += temp[charPosition];
charPosition++;
}
while(temp[charPosition] != endl) {
email[x] += temp[charPosition];
charPosition++;
}
x++;
}
Is it necessary to attach the null character '\0' at the end of each string? And if I do not attach it, will it create problems when I actually use those string variables in my program? I'm new to C++, and I've come up with this solution. If anybody has a better technique, it is surely welcome.
Edit: Also I can't compare a char(acter) with endl, how can I?
Edit: The code that I've written isn't working. It gives me following error.
Segmentation fault (core dumped)
Note: I can only use .txt file. A .csv file can't be used.
There are many techniques to do this. I suggest searching Stack Overflow for "[C++] read file" to see some more methods.
Find and Substring
You could use the std::string::find method to find the delimiter and then use std::string::substr to return a substring between the position and the delimiter.
std::string::size_type position = 0;
position = temp.find('|');
if (position != std::string::npos)
{
firstName[x] = temp.substr(0, position);
}
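Repeating that find/substr step for every delimiter gives a complete split of one record. The following is a sketch under the assumption that the fields never contain the delimiter themselves; the name split_record is hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits `record` on `delimiter` using find and substr.
std::vector<std::string> split_record(const std::string& record, char delimiter)
{
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    std::string::size_type position = record.find(delimiter, start);
    while (position != std::string::npos) {
        fields.push_back(record.substr(start, position - start));
        start = position + 1;                   // skip the delimiter
        position = record.find(delimiter, start);
    }
    fields.push_back(record.substr(start));     // last field has no delimiter
    return fields;
}
```

For each line read with getline, split_record(temp, '|') would return the four columns in order.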
If you don't terminate a C-style string with a null character, there is no way to determine where the string ends. Thus, you'll need to terminate such strings.
I would personally read the data into std::string objects:
std::string first, last, etc;
while (std::getline(readFromFile, first, '|')
&& std::getline(readFromFile, last, '|')
&& std::getline(readFromFile, etc)) {
// do something with the input
}
std::endl is a manipulator implemented as a function template. You can't compare a char with that. There is also hardly ever a reason to use std::endl because it flushes the stream after adding a newline which makes writing really slow. You probably meant to compare to a newline character, i.e., to '\n'. However, since you read the string with std::getline() the line break character will already be removed! You need to make sure you don't access more than temp.size() characters otherwise.
Your record also contains arrays of strings rather than arrays of characters, and you append individual chars to them. Either you wanted to use char something[SIZE], or you should store whole strings!

c++ change space to enter

I don't know if this is even possible.
I have an assignment to translate words and phrases into Pig Latin in C++. The fastest way to do this would be to have the user hit enter after each word, but this would make entering a continuous phrase impossible without hitting enter instead of the space bar.
your
text
would
be
entered
like
this
Then your output could easily be:
youway exttay ouldway ebay enteredway ikelay histay
But still putting the info in would be weird.
Instead I would like to force the program to treat the space bar as though it were the enter key (carriage return).
your text would be entered like this
That way each word would enter my array separately from the string, the user only having to hit enter 1 time.
You could do something like:
Read a line of text from user input (which may have multiple words)
Split the line into words
Translate each word into Pig Latin
Print the words out with spaces between them
Rather than thinking of this in terms of "how can I change these keys to mean something else", think of it in terms of "how can I best work with what the user is expecting to type". If the user is expecting to type spaces between words (makes sense), then design your program so that it can handle that kind of input.
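The four steps above can be sketched as below. The translation rule is an assumption simplified from the examples in the question (words starting with a vowel get "way" appended, otherwise the first letter moves to the end followed by "ay"); the assignment's real rules may differ. Both function names are made up for illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical, simplified Pig Latin rule (see the assumptions above).
std::string piglatin(const std::string& word)
{
    if (word.empty()) {
        return word;
    }
    const std::string vowels = "aeiouAEIOU";
    if (vowels.find(word[0]) != std::string::npos) {
        return word + "way";
    }
    return word.substr(1) + word[0] + "ay";
}

// Splits one line on whitespace, translates each word, and joins the
// results with single spaces.
std::string translate_line(const std::string& line)
{
    std::istringstream words(line);
    std::string word;
    std::string result;
    while (words >> word) {
        if (!result.empty()) {
            result += ' ';
        }
        result += piglatin(word);
    }
    return result;
}
```

The user types the whole phrase, the program reads it with std::getline(std::cin, line), and prints translate_line(line).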
You can have the user input data as a single line, since that seems natural.
If you want some help in parsing the words to operate on them one at a time, then try this other question.
Here's the cheap-o way to do it:
std::string in;
while (std::cin >> in)
std::cout << piglatin(in) << char(std::cin.get());
std::cin >> in skips any leading whitespace in the input stream, and then fills in with the next whitespace-terminated word from the input stream, leaving the whitespace termination in the input stream. char(std::cin.get()) then extracts that terminator (which might be a space or a new line). The while loop is terminated by an end-of-file.
You can use that provided you understand it.
Added:
Here's a better way to find whether the word read was terminated with a space or a new-line:
#include <cctype>
#include <istream>
char look_for_nl(std::istream& is) {
    for (char d = is.get(); is; d = is.get()) {
        if (d == '\n') return d;
        // cast first: passing a negative char to isspace is undefined
        if (!std::isspace(static_cast<unsigned char>(d))) {
            is.putback(d);
            return ' ';
        }
    }
    // We got an eof and there was no NL character. We'll pretend we saw one
    return '\n';
}
Now the hack looks like this:
std::string in;
while (std::cin >> in)
std::cout << piglatin(in) << look_for_nl(std::cin);

How to know if the next character is EOF in C++

I need to know if the next char in an ifstream is the end of file. I'm trying to do this with .peek():
if (file.peek() == -1)
and
if (file.peek() == file.eof())
But neither works. Is there a way to do this?
Edit: What I'm trying to do is to add a letter to the end of each word in a file. In order to do so I ask if the next char is a punctuation mark, but in this way the last word is left without an extra letter. I'm working just with char, not string.
istream::peek() returns the constant EOF (which is not guaranteed to be equal to -1) when it detects end-of-file or error. To check robustly for end-of-file, do this:
int c = file.peek();
if (c == EOF) {
    if (file.eof()) {
        // end of file
    } else {
        // error
    }
} else {
    // do something with 'c'
}
You should know that the underlying OS primitive, read(2), only signals EOF when you try to read past the end of the file. Therefore, file.eof() will not be true when you have merely read up to the last character in the file. In other words, file.eof() being false does not mean the next read operation will succeed.
This should work:
if (file.peek(), file.eof())
But why not just check for errors after making an attempt to read useful data?
file.eof() returns a flag value. It is set to TRUE if you can no longer read from file. EOF is not an actual character, it's a marker for the OS. So when you're there - file.eof() should be true.
So, instead of if (file.peek() == file.eof()) you should have if (true == file.eof()) after a read (or peek) to check if you reached the end of file (which is what you're trying to do, if I understand correctly).
For a stream connected to the keyboard the eof condition is that I intend to type Ctrl+D/Ctrl+Z during the next input.
peek() is totally unable to see that. :-)
Usually to check end of file I used:
if(cin.fail())
{
// Do whatever here
}
Another such way to implement that would be..
while(!cin.fail())
{
// Do whatever here
}
Additional information would be helpful so we know what you want to do.
There is no way of telling if the next character is the end of the file, and trying to do so is one of the commonest errors that new C and C++ programmers make, because there is no end-of-file character in most operating systems. What you can tell is that reading past the current position in a stream will read past the end of file, but this is in general pretty useless information. You should instead test all read operations for success or failure, and act on that status.
You didn't show any code you are working with, so there is some guessing on my part. You don't usually need low-level facilities (like peek()) when working with streams. What you are probably interested in is istream_iterator. Here is an example:
cout << "enter value";
for(istream_iterator<double> it(cin), end;
it != end; ++it)
{
cout << "\nyou entered value " << *it;
cout << "\nTry again ...";
}
You can also use istreambuf_iterator to work on buffer directly:
cout << "Please, enter your name: ";
string name;
for(istreambuf_iterator<char> it(cin.rdbuf()), end;
it != end && *it != '\n'; ++it)
{
name += *it;
}
cout << "\nyour name is " << name;
Just use this code on macOS:
if (true == file.eof())
It works for me on macOS!