why does char[1] read entire word from my input file? - c++

This is what I have done so far. I want to read words from a file in C++, and I am only allowed to use the cstring library. This is my code:
#include <cstring>
#include <fstream>
#include <stdio.h>
using namespace std;
int main(){
ifstream file;
char word[1];
file.open("p.txt");
while (!file.eof()){
file >> word;
cout << word << endl;
}
system("pause");
return 0;
}
It works fine and reads one word at a time, but I don't understand why.
How can a char array of any size, be it char word[1] or char word[50], read exactly one word at a time, ignoring spaces?
Further, I want to store these words in a dynamic array. How can I achieve this? Any guidance would be appreciated.

Your code has undefined behaviour: operator >> simply overwrites memory beyond the end of the array.
Note that the header <stdio.h> you included is not used in the program. On the other hand, you need to include <iostream> for std::cout and <cstdlib>, which declares the function system.
As for your second question, you could use a standard container such as std::vector<std::string>.
For example:
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <cstdlib>
int main()
{
std::ifstream file("p.txt");
std::string s;
std::vector<std::string> v;
v.reserve( 100 );
while ( file >> s ) v.push_back( s );
std::system( "pause" );
return 0;
}
Or you can simply define the vector as
std::vector<std::string> v( ( std::istream_iterator<std::string>( file ) ),
std::istream_iterator<std::string>() );
provided that you include the header <iterator>.
For example
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <iterator>
#include <cstdlib>
int main()
{
std::ifstream file("p.txt");
std::vector<std::string> v( ( std::istream_iterator<std::string>( file ) ),
std::istream_iterator<std::string>() );
for ( const std::string &s : v ) std::cout << s << std::endl;
std::system( "pause" );
return 0;
}

Your code is invoking undefined behavior. That it doesn't crash is a roll of the dice, but its execution is not deterministic precisely because that is the nature of being undefined.
The easiest way (I've found) to load a file of words with whitespace separation is by:
std::ifstream inp("p.txt");
std::istream_iterator<std::string> inp_it(inp), inp_eof;
std::vector<std::string> strs(inp_it, inp_eof);
strs will contain every whitespace delimited char sequence as a linear vector of std::string. Use std::string for dynamic string content and don't feel the least bit guilty about exploiting the hell out of the hard work those that came before you gave us all: The Standard Library.

Your code misbehaves because of the operator>> overload for char *.
An array of char, regardless of its size, decays to the type char *, whose value is the address of the start of the array.
For compatibility with the C language, the overloaded operator>>(char *) has been implemented to read one or more characters until a terminating whitespace character is reached, or there is an error with the stream.
If you declare an array of 1 character and read from a file containing "California", the function will put 'C' into the first location of the array and keep writing the remaining characters to the next locations in memory (regardless of what data type they are). This is known as a buffer overflow.
A much safer method is to read into a std::string or if you only want one character, use a char variable. Look in your favorite C++ reference for the getline methods. There is an overload for reading until a given delimiter is reached.

You only need a couple of changes:
#include <cstdlib>  // system
#include <fstream>
#include <iostream> // cout, endl
#include <string>
using namespace std;
int main(){
ifstream file;
string word;
file.open("p.txt");
while (file >> word){
cout << word << endl;
}
system("pause");
return 0;
}

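It works only because you are lucky and don't overwrite some critical memory. You need to allocate enough bytes for the char word array, say char word[64]; even then, a longer word would overflow it unless you cap the extraction, e.g. file >> setw(64) >> word (setw is in <iomanip>). Use while(file>>word) as your loop condition instead of testing eof(). In the loop you can push_back the word into a std::vector<string> if you are allowed to use the C++ standard library.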
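To make the fixed-buffer advice concrete, here is a minimal sketch of a width-limited read. The function name readWordsBounded and the deliberately tiny 8-byte buffer are my own choices for illustration, not something from the question; it takes any std::istream so it works with an ifstream as well.

```cpp
#include <iomanip>
#include <istream>
#include <string>
#include <vector>

// Reads whitespace-separated words through a small fixed buffer.
// std::setw caps each extraction at (buffer size - 1) characters plus the
// terminating '\0', so a long word is split into pieces instead of
// overflowing the array.
std::vector<std::string> readWordsBounded(std::istream& in) {
    char word[8];  // deliberately tiny so the splitting is visible
    std::vector<std::string> words;
    // The width resets after every extraction, so it is set again each time.
    while (in >> std::setw(static_cast<int>(sizeof word)) >> word) {
        words.push_back(word);
    }
    return words;
}
```

With the input "California is big" this yields "Califor", "nia", "is", "big": the long word is cut at seven characters rather than written past the end of the buffer.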
If you want a simple C++11 STL-like solution, use this
#include <algorithm>
#include <iterator>
#include <vector>
#include <string>
#include <fstream>
#include <iostream>
using namespace std;
int main()
{
ifstream fin("./in.txt"); // input file
vector<string> words; // store the words in a vector
copy(istream_iterator<string>(fin),{}, back_inserter(words)); // insert the words
for(auto &elem: words)
cout << elem << endl; // display them
}
Or, more compactly, construct the container directly from the stream iterator like
vector<string> words(istream_iterator<string>(fin),{});
and remove the copy statement.
If you use a multiset<string> (#include <set>) instead of a vector<string> and change
copy(istream_iterator<string>(fin),{}, back_inserter(words)); // insert the words
to
copy(istream_iterator<string>(fin),{}, inserter(words, words.begin())); // insert the words
you get the words in sorted order. So using the STL is the cleanest approach in my opinion.

You're using C++, so you can avoid all that C stuff.
std::string word;
std::vector<std::string> words;
std::fstream stream("wordlist");
// this assumes one word (or phrase, with spaces, etc) per line...
while (std::getline(stream, word))
words.push_back(word);
or for multiple words (or phrases, with spaces, etc) per line separated by commas:
while (std::getline(stream, word, ','))
words.push_back(word);
or for multiple words per line separated by spaces:
while(stream >> word)
words.push_back(word);
No need to worry about buffer sizes or memory allocation or anything like that.

file >> char *
works with any char *, and you are using
file >> word;
so it simply sees the word variable as a char *. But you are writing past the end of the array, which can cause a segmentation fault somewhere, and as your code grows you will see things stop working without any apparent reason. A debugger such as GDB will show you where the seg fault occurs.
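If the cstring-only restriction from the question really rules out std::string and std::vector, the dynamic array has to be grown by hand. The following is only a rough sketch under that assumption; WordList, append, destroy and readWords are hypothetical names invented here, and the 64-byte word buffer is an arbitrary limit.

```cpp
#include <cstring>
#include <iomanip>
#include <istream>

// Hypothetical helper type: a manually grown array of heap-allocated C strings.
struct WordList {
    char** items = nullptr;
    std::size_t size = 0;
    std::size_t capacity = 0;
};

// Appends a copy of word, doubling the pointer array when it is full.
void append(WordList& list, const char* word) {
    if (list.size == list.capacity) {
        std::size_t newCapacity = list.capacity ? list.capacity * 2 : 4;
        char** bigger = new char*[newCapacity];
        for (std::size_t i = 0; i < list.size; ++i) bigger[i] = list.items[i];
        delete[] list.items;  // frees only the old pointer array, not the words
        list.items = bigger;
        list.capacity = newCapacity;
    }
    char* copy = new char[std::strlen(word) + 1];  // +1 for the '\0'
    std::strcpy(copy, word);
    list.items[list.size++] = copy;
}

// Releases every stored word and the pointer array itself.
void destroy(WordList& list) {
    for (std::size_t i = 0; i < list.size; ++i) delete[] list.items[i];
    delete[] list.items;
    list = WordList{};
}

// Reads words into the list; setw keeps each read inside the buffer.
void readWords(std::istream& in, WordList& list) {
    char word[64];
    while (in >> std::setw(static_cast<int>(sizeof word)) >> word) {
        append(list, word);
    }
}
```

The caller is responsible for calling destroy once the list is no longer needed, which is exactly the bookkeeping that std::vector<std::string> would otherwise do for you.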

Related

How can I take a string input like this?

If I have a string containing an unknown number of words, how can I split it into multiple strings in C++?
For example:
"I am a boy". I want each of these individual words to be in its own string.
"My name is John Lui". Each of these as well.
One way I could think of was to use getline in C++ and then parse through the entire string until a separator character is found, storing the pieces in separate strings. Is there a better method? Thanks!
Also, when a delimiter is passed to getline, getline reads the input up to the delimiter and puts that part into a new string. But what happens if the delimiter is not present at all? Does it throw an exception, or does it read the whole string up to the newline character? Thanks!
You could use std::getline, which reads into a string instead of a char array. Strings are easier to use since they know their size, grow automatically, and you don't have to worry about the null terminating character. It is also possible to convert a char array to a string by using the appropriate string constructor.
You can do it by stringstream:
// stringstream::str
#include <string> // std::string
#include <iostream> // std::cout
#include <sstream> // std::stringstream, std::stringbuf
using namespace std;
int main ()
{
std::string str;
getline( std::cin, str );
std::stringstream ss;
ss<<str;
std::string s;
while(ss>>s)
{
std::cout << s << '\n';
}
return 0;
}
Input: I am a boy
Output:
I
am
a
boy
If you want to store each word in a vector, you can do it like this:
// stringstream::str
#include <string> // std::string
#include <iostream> // std::cout
#include <sstream> // std::stringstream, std::stringbuf
#include <vector>
using namespace std;
int main ()
{
vector <string> V;
V.clear();
std::string str;
getline( std::cin, str );
std::stringstream ss;
ss<<str;
std::string s,s1;
while(ss>>s)
{
V.push_back(s);
}
return 0;
}

Reading from a text file, best way to store data C++

Basically I have a text file whose values I need to read in so the program can manipulate them.
I'm using C++ and I have written working code to tell whether the file exists or not.
The text file is formatted like this:
1 7
8 10
20 6
3 14
...
The values on the left are X values and the values on the right are Y values. (The space in the middle is a tab)
How do I extract this data? Say, to pass the values into a class like this...
myVector(X,Y);
Also, I guess before I can use it in a class I have to TryParse to change it from a string to an int, right? Can C++ do this?
Thank you!
I would be writing something like this if I were you. Note, this is just prototype code, and it was not even tested.
The fundamental idea is to read twice in a line, but with different delimiters. You would read with the tab delimiter first, and then just the default line end.
You need to make sure to gracefully quit the loop when you do not have anything more to read, hence the breaks, albeit the second could be enough if your file is "correct".
You will also need to make sure to convert to the proper type that your vector class expects. I assumed here that is int, but if it is string, you do not need the conversion I have put in place.
#include <string>
#include <fstream>
using namespace std;
void yourFunction()
{
// ...
ifstream myfile("myfile.txt");
string xword, yword;
while (1) {
if (!getline(myfile, xword, '\t'))
break;
if (!getline(myfile, yword))
break;
myVector.push_back(stoi(xword), stoi(yword));
}
// ...
}
This sort of parsing could be done in one line with boost.spirit:
qi::phrase_parse(begin, end, *(qi::int_ > qi::int_ > qi::eol), qi::ascii::blank, v);
The grammar could be read as: "read one int, then one int, then one EOL (end of line) (\n or \r\n, depends on locale), as many times as possible". Blank characters (e.g. spaces or tabs) may appear between the ints and the EOL.
Advantages: compared with std::getline loops, the code is clearer and more concise. spirit.qi gives you more powerful control, and you don't need stoi calls.
Drawbacks: a build-time dependency (not a runtime one) on spirit.qi, and longer compilation time.
#include <iostream>
#include <fstream>
#include <vector>
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/include/std_pair.hpp>
namespace spirit = boost::spirit;
namespace qi = spirit::qi;
int main(int argc, char **argv)
{
std::ifstream in(argv[1], std::ios_base::in);
std::string storage;
in.unsetf(std::ios::skipws);
spirit::istream_iterator begin(in), end;
std::vector<std::pair<int, int> > v;
qi::phrase_parse(begin, end, *(qi::int_ > qi::int_ > qi::eol), qi::ascii::blank, v);
for(const auto& p : v)
std::cout << p.first << "," << p.second << std::endl;
return 0;
}

C++ how to put an input string from stdio into a vector, one word per container element

I'm learning c++, and I'm a bit of a newbie. I've researched this question quite a bit. I've studied vectors, strings, and stringstreams in c++ but I still can't find the 'right' way to do this.
Basically, I want to write, "some text" at the command line and have "some" put into a vector container at position '0' and "text" put into the same container in position '1'.
I've found a lot of ways that sorta work, but nothing that just does that.
Thanks for the help.
As per your comment:
#include <string>
#include <iostream>
#include <sstream>
#include <vector>
#include <algorithm>
#include <iterator>
int main() {
std::string line;
std::getline(std::cin, line); // read one line from cin
std::stringstream buffer(line);
std::vector<std::string> words;
// copy each word from line to words
std::copy(std::istream_iterator<std::string>(buffer),
std::istream_iterator<std::string>(),
std::back_inserter(words));
}
You can simply use >> to achieve this effect.
std::vector<std::string> vector;
std::string string;
while(std::cin >> string)
vector.push_back(string);

How does this program work

This is my first C++ program. It prints the number of words in the input.
My first question: how does it go into the loop and add to the count? Is it every time I type the space character? If so, how does it know I'm trying to count words?
#include <iostream>
#include <string>
using namespace std;
int main() {
int count;
string s;
count = 0;
while (cin >> s)
count++;
cout << count << '\n';
return 0;
}
My second question: can someone explain what namespace std means, for a beginner?
When you do cin >> s, you read a word and put it in the string. Yes, it reads char by char until it reaches a delimiter (whitespace).
std means Standard. The Standard C++ library is inside the std namespace. You can rewrite your code without using namespace std:
#include <iostream>
#include <string>
int main() {
int count;
std::string s;
count = 0;
while (std::cin >> s)
count++;
std::cout << count << '\n';
return 0;
}
I discourage novices from using the using namespace std statement because it makes it harder to understand what is going on.
cin will capture input until whitespace, yes. The specific style of loop you have will go until an End-Of-File (EOF) is found or until bad input is provided. That loop doesn't look like common C++ practice to me, but it's described here.
2. namespace std is how you tell the compiler where to look to find the objects you're referencing in your code. Because different objects are "inside" different namespaces, you either have to tell the compiler where they are specifically (e.g. std::cin) or tell it, for convenience, where the objects you use later will be found (with using namespace std).
In your code, cin >> s attempts to read a std::string from input stream. If the attempt succeeds, then the returned value of cin >> s implicitly converts into true and the while loop continues, incrementing the counter. Otherwise, the while loop exits when the attempt fails, as there is no more data to read from the input stream.
You can use std::distance to count the words, as shown below:
#include <iostream>
#include <algorithm>
#include <iterator>
#include <string>
int main() {
std::istream_iterator<std::string> begin(std::cin), end;
size_t count = std::distance(begin, end);
std::cout << count << std::endl;
return 0;
}
Demo : http://www.ideone.com/Hldz3
In this code, you create two iterators begin and end, passing both to std::distance function. The function calculates the distance between begin and end. The distance is nothing but the number of strings in the input stream, because the iterator begin iterates over strings coming from the input stream, and end defines the end of the iterator where begin stops iterating. The reason why begin iterates over strings is because the template argument to std::istream_iterator is std::string:
std::istream_iterator<std::string> begin(std::cin), end;
//^^^^^^^^^^^
If you change this to char, then begin will iterate over char, which means the following program will count the number of non-whitespace characters in the input stream (operator>> skips whitespace by default):
#include <iostream>
#include <algorithm>
#include <iterator>
int main() {
std::istream_iterator<char> begin(std::cin), end;
size_t count = std::distance(begin, end);
std::cout << count << std::endl;
return 0;
}
Demo : http://www.ideone.com/NH52y
Similarly, you can do many cool things if you start using iterators from <iterator> header and generic functions from <algorithm> header.
For example, let's say we want to count the number of lines in the input stream. What change would we make to the above program to get the job done? The way we changed std::string to char when we wanted to count characters immediately suggests that we now need to change it to line, so that we can iterate over lines (instead of chars).
As no line class exists in the Standard library, we have to define one ourselves, but the interesting thing is that we can keep it empty as shown below, with full working code:
#include <iostream>
#include <algorithm>
#include <iterator>
#include <string>
struct line {}; //Interesting part!
std::istream& operator >>(std::istream & in, line &)
{
std::string s;
return std::getline(in, s);
}
int main() {
std::istream_iterator<line> begin(std::cin), end;
size_t count = std::distance(begin, end);
std::cout << count << std::endl;
return 0;
}
Yes, along with line, you have to define operator>> for line as well. It is used by the std::istream_iterator<line> class.
Demo : http://www.ideone.com/iKPA6

How to read and write a STL C++ string?

#include<string>
...
string in;
//How do I store a string from stdin to in?
//
//gets(in) - 16 cannot convert `std::string' to `char*' for argument `1' to
//char* gets (char*)'
//
//scanf("%s",in) also gives some weird error
Similarly, how do I write out in to stdout or to a file??
You are trying to mix C style I/O with C++ types. When using C++ you should use the std::cin and std::cout streams for console input and output.
#include <string>
#include <iostream>
...
std::string in;
std::string out("hello world");
std::cin >> in;
std::cout << out;
But when reading a string, std::cin stops reading as soon as it encounters a space or newline. You may want to use std::getline to get an entire line of input from the console.
std::getline(std::cin, in);
You use the same methods with a file (when dealing with non binary data).
std::ofstream ofs("myfile.txt");
ofs << myString;
There are many ways to read text from stdin into a std::string. The thing about std::strings though is that they grow as needed, which in turn means they reallocate. Internally a std::string has a pointer to a fixed-length buffer. When the buffer is full and you ask to append one or more characters, the std::string object will allocate a new, larger buffer to replace the old one and move all the text to the new buffer.
All this to say that if you know the length of the text you are about to read beforehand, then you can improve performance by avoiding these reallocations.
#include <iostream>
#include <iterator> // istreambuf_iterator, back_inserter
#include <string>
using namespace std;
// ...
// if you don't know the length of string ahead of time
// (the extra parentheses avoid the "most vexing parse", which would
// otherwise declare a function named in):
string in((istreambuf_iterator<char>(cin)), istreambuf_iterator<char>());
// if you do know the length of string:
in.reserve(TEXT_LENGTH);
in.assign(istreambuf_iterator<char>(cin), istreambuf_iterator<char>());
// alternatively (include <algorithm> for this):
copy(istreambuf_iterator<char>(cin), istreambuf_iterator<char>(),
back_inserter(in));
All of the above will copy all text found in stdin, until end-of-file. If you only want a single line, use std::getline():
#include <string>
#include <iostream>
// ...
string in;
while( getline(cin, in) ) {
// ...
}
If you want a single character, use std::istream::get():
#include <iostream>
// ...
char ch;
while( cin.get(ch) ) {
// ...
}
C++ strings must be read and written using >> and << operators and other C++ equivalents. However, if you want to use scanf as in C, you can always read a string the C++ way and use sscanf with it:
std::string s;
std::getline(std::cin, s);
sscanf(s.c_str(), "%i%i%c", ...);
The easiest way to output a string is with:
s = "string...";
cout << s;
But printf will work too:
printf("%s", s.c_str());
The method c_str() returns a pointer to a null-terminated character array (a C string), which can be passed to any standard C function that takes a const char *.
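To fill in the sscanf idea with something concrete, here is a hedged sketch that parses two integers and a character out of a std::string. The Parsed struct and the function name are invented for this example, and the space before %c is my addition: without it, %c would pick up the blank that follows the second number instead of the next visible character.

```cpp
#include <cstdio>
#include <string>

// Hypothetical result type for the example.
struct Parsed {
    int a;
    int b;
    char c;
    bool ok;  // true only if all three fields were converted
};

Parsed parseTwoIntsAndChar(const std::string& line) {
    Parsed p = {0, 0, '\0', false};
    // sscanf returns the number of fields it managed to convert.
    // The space in " %c" skips any whitespace before the character.
    p.ok = std::sscanf(line.c_str(), "%i%i %c", &p.a, &p.b, &p.c) == 3;
    return p;
}
```

For the line "12 34 x" this gives a = 12, b = 34, c = 'x'. Checking the return value matters, because on a line like "12 x" only the first field is converted and the others keep their initial values.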