storing text to string from a file that contains spaces - c++

So I've been doing algorithms in C++ for about 3 months now as a hobby. I've never had a problem I couldn't solve by googleing up until now. I'm trying to read from a text file that will be converted into a hash table, but when i try and capture the data from a file it ends at a space. here's the code
#include <iostream>
#include <fstream>
int main()
{
using namespace std;
ifstream file("this.hash");
file >> noskipws;
string thing;
file >> thing;
cout << thing << endl;
return 0;
}
I'm aware of the noskipws flag i just don't know how to properly implement it

When using the formatted input operator for std::string it always stops at what the stream considers to be whitespace. Using the std::locale's character classification facet std::ctype<char> you can redefine what space means. It's a bit involved, though.
If you want to read up to a specific separator, you can use std::getline(), possibly specifying the separator you are interested in, e.g.:
std::string value;
if (std::getline(in, value, ',')) { ... }
reads character until it finds a comma or the end of the file is reached and stores the characters up to the separator in value.
If you just want to read the entire file, one way to do is to use
std::ifstream in(file.c_str());
std::string all((std::istreambuf_iterator<char>(in)), std::istreambuf_iterator<char>());

I think the best tool for what you're trying to do is get, getline or read. Now those all use char buffers rather than std::strings, so need a bit more thought, but they're quite straightforward really. (edit: std::getline( file, string ), as pointed out by Dietmar Kühl, uses c++ strings rather than character buffers, so I would actually recommend that. Then you won't need to worry about maximum line lengths)
Here's an example which will loop through the entire file:
#include <iostream>
int main () {
char buffer[1024]; // line length is limited to 1023 bytes
std::ifstream file( "this.hash" );
while( file.good( ) ) {
file.getline( buffer, sizeof( buffer ) );
std::string line( buffer ); // convert to c++ string for convenience
// do something with the line
}
return 0;
}
(note that line length is limited to 1023 bytes, and if a line is longer it will be broken into 2 reads. When it's a true newline, you'll see a \n character at the end of the string)
Of course, if you a maximum length for your file in advance, you can just set the buffer accordingly and do away with the loop. If the buffer needs to be very big (more than a couple of kilobytes), you should probably use new char[size] and delete[] instead of the static array, to avoid stack overflows.
and here's a reference page: http://www.cplusplus.com/reference/fstream/ifstream/

Related

How to read/write from a data file in C++

I'm having a problem reading from a binary file (*.dat) using the .read(reinterpret_cast (&x),sizeof(x)) command but there is always an error about the existence of the file even when the file exist or has been created successfully. Here is the code:
#include <iostream>
#include <string>
#include <fstream>
using namespace std;
struct x{
char name[10],pass[10];
};
int main()
{
x x1,x2;
fstream inout;
inout.open("test.dat" ,ios::binary);
if(!inout)
{
cout<<"Error";
exit(1);
}
cout<<"Enter your name:";
cin>>x1.name;
inout.write(reinterpret_cast <const char*> (&x1.name), sizeof(x1));
cout<<"Enter your name:";
cin>>x1.pass;
inout.write(reinterpret_cast <const char*> (&x1.pass), sizeof(x1));
while(inout.read(reinterpret_cast <char*> (&x2.name), sizeof(x1)))
{
cout<<x2.name;//here is my problem cannot read!!
}
inout.close();
}
Use std:flush after your write operations.
// ... Write x1.name and x1.pass
inout << std::flush;
// ... Read x2.name in while loop.
inout.close();
There is a problem with your output to the file.
First you are writing the struct x1 to the file where only the name field is filled
inout.write(reinterpret_cast <const char*> (&x1.name), sizeof(x1));
and afterwards:
inout.write(reinterpret_cast <const char*> (&x1.pass), sizeof(x1));
You start writing from the address of x1.pass but you are writing sizeof(x1) bytes.
sizeof(x1) is 20 here but its only 10 bytes from the start of x1.pass to the end of the struct, so you are writing 10 bytes of unknown data from the stack into your file.
So this is the first thing that your file may not contain what you expect it to contain.
The next thing is that after writing your data the stream is sitting at the end of the file and you try to read from there. You have to move the position back to the beginning of the stream to read the stuff you just wrote. For example use:
inout.seekg(std::ios::beg);
If you mess with read and write to the same stream, you'd rather use flush or file positioning functions.
MSDN says:
When a basic_fstream object is used to perform file I/O, although the underlying buffer contains separately designated positions for reading and writing, the current input and current output positions are tied together, and therefore, reading some data moves the output position.
GNU Stdlib:
As you can see, ‘+’ requests a stream that can do both input and output. When using such a stream, you must call fflush (see Stream Buffering) or a file positioning function such as fseek (see File Positioning) when switching from reading to writing or vice versa. Otherwise, internal buffers might not be emptied properly.
Reading into raw C-style arrays from an input stream is not as idiomatic as a simple call to operator>>(). You also have to prevent buffer overruns by keeping track of the both the bytes allocated for the buffer, and the bytes being read into the buffer.
Reading into the buffer can be done by using the input stream method getline(). The following example shows the extraction into x1.name; the same would be done for x1.path:
if (std::cin.getline(x1.name, sizeof(x1.name))) {
}
The second argument is the maximum number of bytes to be read. It is useful in that the stream won't write pass the allocated bounds of the array. The next thing to do is just write it to the file as you have done:
if (std::cin.getline(x1.name, sizeof(x1.name))) {
inout.write(reinterpret_cast<char*>(&x1.name), std::cin.gcount());
}
std::cin.gcount() is the number of characters that were read from the input stream. It is a much more reliable alternative to sizeof(x1.name) in that it returns the number of characters written, not the characters allotted.
Now, bidirectional file streams are a bit tricky. They have be coordinated in the right way. As explained in the other answers, bidirectional file streams (or std::fstreams) share a joint buffer for both input and output. The position indicators that mark positions in the input and output sequence are both affected by any input and output operations that may occur. As such, the file stream position has to be "moved" back before performing input. This can be done by either a call to seekg() or seekp(). Either will suffice since, as I said, the position indicators are bound to each other:
if (std::cin.getline(x1.pass, sizeof(x1.pass))) {
inout.write(reinterpret_cast<char*>(&x1.pass), std::cin.gcount());
inout.seekg(0, std::ios_base::beg);
}
Notice how this was done after the extraction into x1.pass. We can't do it after x1.name because we would be overwriting the stream on the second call to write().
As you can see, extracting into raw C-style arrays isn't pretty, you have to manage more things than you should. Fortunately, C++ comes to the rescue with their standard string class std::string. Use this for more efficient I/O:
Make both name and pass standard C++ strings (std::string) instead of raw C-arrays. This allows you pass in the size as the second argument to your read() and write() calls:
#include <string>
struct x {
std::string name;
std::string pass;
};
// ...
if (std::cin >> x1.name) {
inout.write(x1.name.data(), x1.name.size());
}
if (std::cin >> x1.pass) {
inout.write(x1.name.data(), x1.name.size());
inout.seekg(0, std::ios_base::beg);
}
std::string allows us to leverage its dynamic nature and its capacity for maintaining the size of the buffer. We no longer have to use getline() but now a simple call to operator>>() and an if() check.
This was not possible before, but now that we're using std::string we can also combine both extractions to achieve the following:
if (std::cout << "Enter your name: " && std::cin >> x1.name &&
std::cout << "Enter your pass: " && std::cin >> x1.pass) {
inout.write(x1.name.data(), x1.name.size());
inout.write(x1.pass.data(), x1.pass.size());
inout.seekg(0, std::ios_base::beg);
}
And finally, the last extraction would simply be this:
while (inout >> x2.name)
{
std::cout << x2.name;
}

Length of a char array

I have the code like this:
#include <iostream.h>
#include <fstream.h>
void main()
{
char dir[25], output[10],temp[10];
cout<<"Enter file: ";
cin.getline(dir,25); //like C:\input.txt
ifstream input(dir,ios::in);
input.getline(output,'\eof');
int num = sizeof(output);
ofstream out("D:\\size.txt",ios::out);
out<<num;
}
I want to print the length of the output. But it always returns the number 10 (the given length) even if the input file has only 2 letters ( Like just "ab"). I've also used strlen(output) but nothing changed. How do I only get the used length of array?
I'm using VS C++ 6.0
sizeof operator on array gives you size allocated for the array, which is 10.
You need to use strlen() to know length occupied inside the array, but you need to make sure the array is null terminated.
With C++ better alternative is to simple use: std::string instead of the character array. Then you can simply use std::string::size() to get the size.
sizeof always prints the defined size of an object based on its type, not anything like the length of a string.
At least by current standards, your code has some pretty serious problems. It looks like it was written for a 1993 compiler running on MS-DOS, or something on that order. With a current compiler, the C++ headers shouldn't have .h on the end, among other things.
#include <iostream>
#include <fstream>
#include <string>
int main() {
std::string dir, output, temp;
std::cout<<"Enter file: ";
std::getline(cin, dir); //like C:\input.txt
std::ifstream input(dir.c_str());
std::getline(input, output);
std::ofstream out("D:\\size.txt");
out<<output.size();
}
The getline that you are using is an unformatted input function so you can retrieve the number of characters extracted with input.gcount().
Note that \e is not a standard escape sequence and the character constant \eof almost certainly doesn't do what you think it does. If you don't want to recognise any delimiter you should use read, not getline, passing the size of your buffer so that you don't overflow it.

How to read a specific amount of characters from a text file

I tried to do it like this
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
char b[2];
ifstream f("prad.txt");
f>>b ;
cout <<b;
return 0;
}
It should read 2 characters but it reads whole line. This worked on another language but doesn't work in C++ for some reason.
You can use read() to specify the number of characters to read:
char b[3] = "";
ifstream f("prad.txt");
f.read(b, sizeof(b) - 1); // Read one less that sizeof(b) to ensure null
cout << b; // terminated for use with cout.
This worked on another language but doesn't work in C++ for some
reason.
Some things change from language to language. In particular, in this case you've run afoul of the fact that in C++ pointers and arrays are scarcely different. That array gets passed to operator>> as a pointer to char, which is interpreted as a string pointer, so it does what it does to char buffers (to wit read until the width limit or end of line, whichever comes first). Your program ought to be crashing when that happens, since you're overflowing your buffer.
istream& get (char* s, streamsize n );
Extracts characters from the stream and stores them as a c-string into
the array beginning at s. Characters are extracted until either (n -
1) characters have been extracted or the delimiting character '\n' is
found. The extraction also stops if the end of file is reached in the
input sequence or if an error occurs during the input operation. If
the delimiting character is found, it is not extracted from the input
sequence and remains as the next character to be extracted. Use
getline if you want this character to be extracted (and discarded).
The ending null character that signals the end of a c-string is
automatically appended at the end of the content stored in s.

Reading a fixed number of chars with << on an istream

I was trying out a few file reading strategies in C++ and I came across this.
ifstream ifsw1("c:\\trys\\str3.txt");
char ifsw1w[3];
do {
ifsw1 >> ifsw1w;
if (ifsw1.eof())
break;
cout << ifsw1w << flush << endl;
} while (1);
ifsw1.close();
The content of the file were
firstfirst firstsecond
secondfirst secondsecond
When I see the output it is printed as
firstfirst
firstsecond
secondfirst
I expected the output to be something like:
fir
stf
irs
tfi
.....
Moreover I see that "secondsecond" has not been printed. I guess that the last read has met the eof and the cout might not have been executed. But the first behavior is not understandable.
The extraction operator has no concept of the size of the ifsw1w variable, and (by default) is going to extract characters until it hits whitespace, null, or eof. These are likely being stored in the memory locations after your ifsw1w variable, which would cause bad bugs if you had additional variables defined.
To get the desired behavior, you should be able to use
ifsw1.width(3);
to limit the number of characters to extract.
It's virtually impossible to use std::istream& operator>>(std::istream&, char *) safely -- it's like gets in this regard -- there's no way for you to specify the buffer size. The stream just writes to your buffer, going off the end. (Your example above invokes undefined behavior). Either use the overloads accepting a std::string, or use std::getline(std::istream&, std::string).
Checking eof() is incorrect. You want fail() instead. You really don't care if the stream is at the end of the file, you care only if you have failed to extract information.
For something like this you're probably better off just reading the whole file into a string and using string operations from that point. You can do that using a stringstream:
#include <string> //For string
#include <sstream> //For stringstream
#include <iostream> //As before
std::ifstream myFile(...);
std::stringstream ss;
ss << myFile.rdbuf(); //Read the file into the stringstream.
std::string fileContents = ss.str(); //Now you have a string, no loops!
You're trashing the memory... its reading past the 3 chars you defined (its reading until a space or a new line is met...).
Read char by char to achieve the output you had mentioned.
Edit : Irritate is right, this works too (with some fixes and not getting the exact result, but that's the spirit):
char ifsw1w[4];
do{
ifsw1.width(4);
ifsw1 >> ifsw1w;
if(ifsw1.eof()) break;
cout << ifsw1w << flush << endl;
}while(1);
ifsw1.close();
The code has undefined behavior. When you do something like this:
char ifsw1w[3];
ifsw1 >> ifsw1w;
The operator>> receives a pointer to the buffer, but has no idea of the buffer's actual size. As such, it has no way to know that it should stop reading after two characters (and note that it should be 2, not 3 -- it needs space for a '\0' to terminate the string).
Bottom line: in your exploration of ways to read data, this code is probably best ignored. About all you can learn from code like this is a few things you should avoid. It's generally easier, however, to just follow a few rules of thumb than try to study all the problems that can arise.
Use std::string to read strings.
Only use fixed-size buffers for fixed-size data.
When you do use fixed buffers, pass their size to limit how much is read.
When you want to read all the data in a file, std::copy can avoid a lot of errors:
std::vector<std::string> strings;
std::copy(std::istream_iterator<std::string>(myFile),
std::istream_iterator<std::string>(),
std::back_inserter(strings));
To read the whitespace, you could used "noskipws", it will not skip whitespace.
ifsw1 >> noskipws >> ifsw1w;
But if you want to get only 3 characters, I suggest you to use the get method:
ifsw1.get(ifsw1w,3);

How to read and write a STL C++ string?

#include<string>
...
string in;
//How do I store a string from stdin to in?
//
//gets(in) - 16 cannot convert `std::string' to `char*' for argument `1' to
//char* gets (char*)'
//
//scanf("%s",in) also gives some weird error
Similarly, how do I write out in to stdout or to a file??
You are trying to mix C style I/O with C++ types. When using C++ you should use the std::cin and std::cout streams for console input and output.
#include <string>
#include <iostream>
...
std::string in;
std::string out("hello world");
std::cin >> in;
std::cout << out;
But when reading a string std::cin stops reading as soon as it encounters a space or new line. You may want to use std::getline to get a entire line of input from the console.
std::getline(std::cin, in);
You use the same methods with a file (when dealing with non binary data).
std::ofstream ofs("myfile.txt");
ofs << myString;
There are many way to read text from stdin into a std::string. The thing about std::strings though is that they grow as needed, which in turn means they reallocate. Internally a std::string has a pointer to a fixed-length buffer. When the buffer is full and you request to add one or more character onto it, the std::string object will create a new, larger buffer instead of the old one and move all the text to the new buffer.
All this to say that if you know the length of text you are about to read beforehand then you can improve performance by avoiding these reallocations.
#include <iostream>
#include <string>
#include <streambuf>
using namespace std;
// ...
// if you don't know the length of string ahead of time:
string in(istreambuf_iterator<char>(cin), istreambuf_iterator<char>());
// if you do know the length of string:
in.reserve(TEXT_LENGTH);
in.assign(istreambuf_iterator<char>(cin), istreambuf_iterator<char>());
// alternatively (include <algorithm> for this):
copy(istreambuf_iterator<char>(cin), istreambuf_iterator<char>(),
back_inserter(in));
All of the above will copy all text found in stdin, untill end-of-file. If you only want a single line, use std::getline():
#include <string>
#include <iostream>
// ...
string in;
while( getline(cin, in) ) {
// ...
}
If you want a single character, use std::istream::get():
#include <iostream>
// ...
char ch;
while( cin.get(ch) ) {
// ...
}
C++ strings must be read and written using >> and << operators and other C++ equivalents. However, if you want to use scanf as in C, you can always read a string the C++ way and use sscanf with it:
std::string s;
std::getline(cin, s);
sscanf(s.c_str(), "%i%i%c", ...);
The easiest way to output a string is with:
s = "string...";
cout << s;
But printf will work too:
[fixed printf]
printf("%s", s.c_str());
The method c_str() returns a pointer to a null-terminated ASCII string, which can be used by all standard C functions.