Guido van Rossum demonstrates the simplicity of Python in this article and uses this function for buffered reads of a file of unknown length:
def intsfromfile(f):
    while True:
        a = array.array('i')
        a.fromstring(f.read(4000))
        if not a:
            break
        for x in a:
            yield x
I need to do the same thing in C++ for speed reasons! I have many files containing sorted lists of unsigned 64-bit integers that I need to merge. I have found this nice piece of code for merging vectors.
I am stuck on how to make an ifstream for a file of unknown length present itself as a vector which can be happily iterated over until the end of the file is reached. Any suggestions? Am I barking up the correct tree with an istreambuf_iterator?
In order to disguise an ifstream (or really, any input stream) in a form that acts like an iterator, you want to use the istream_iterator or the istreambuf_iterator template class. The former is useful for files where the formatting is of concern. For example, a file full of whitespace-delimited integers can be read into the vector's iterator range constructor as follows:
#include <fstream>
#include <vector>
#include <iterator> // needed for istream_iterator
using namespace std;
int main(int argc, char** argv)
{
    ifstream infile("my-file.txt");
    // It isn't customary to declare these as standalone variables,
    // but see below for why it's necessary when
    // initializing containers.
    istream_iterator<int> infile_begin(infile);
    istream_iterator<int> infile_end;
    vector<int> my_ints(infile_begin, infile_end);
    // You can also do stuff with the istream_iterator objects directly:
    // Careful! If you run this program as is, this won't work because we
    // used up the input stream already with the vector.
    int total = 0;
    while (infile_begin != infile_end) {
        total += *infile_begin;
        ++infile_begin;
    }
    return 0;
}
istreambuf_iterator is used to read through files a single character at a time, disregarding the formatting of the input. That is, it will return you all characters, including spaces, newline characters, and so on. Depending on your application, that may be more appropriate.
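To make the difference concrete, here is a minimal sketch that reads the same text through both iterator types. The helper names read_formatted and read_raw are made up for this illustration, and a std::istringstream stands in for the file:

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Formatted read: istream_iterator<char> goes through operator>>,
// which skips whitespace by default.
std::string read_formatted(const std::string& text) {
    std::istringstream in(text);
    return std::string(std::istream_iterator<char>(in),
                       std::istream_iterator<char>());
}

// Unformatted read: istreambuf_iterator<char> hands back every character
// in the stream buffer, including spaces and newlines.
std::string read_raw(const std::string& text) {
    std::istringstream in(text);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

For the input "a b\nc", read_formatted returns "abc" while read_raw returns the text unchanged.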
Note: Scott Meyers explains in Effective STL why the separate variable declarations for istream_iterator are needed above. Normally, you would do something like this:
ifstream infile("my-file.txt");
vector<int> my_ints(istream_iterator<int>(infile), istream_iterator<int>());
However, C++ actually parses the second line in an incredibly bizarre way. It sees it as the declaration of a function named my_ints that takes in two parameters and returns a vector<int>. The first parameter is of type istream_iterator<int> and is named infile (the parentheses are ignored). The second parameter is a function pointer with no name that takes zero arguments (because of the parentheses) and returns an object of type istream_iterator<int>.
Pretty cool, but also pretty aggravating if you're not watching out for it.
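Besides naming the iterators, two other common ways to sidestep this parse can be sketched as follows. The function names are hypothetical, a std::istringstream replaces the file so the snippet is self-contained, and the second variant assumes a C++11 compiler:

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

std::vector<int> parse_with_parentheses(const std::string& text) {
    std::istringstream in(text);
    // The extra parentheses around the first argument mean it can no longer
    // be read as a parameter declaration, so this declares an object.
    std::vector<int> values((std::istream_iterator<int>(in)),
                            std::istream_iterator<int>());
    return values;
}

std::vector<int> parse_with_braces(const std::string& text) {
    std::istringstream in(text);
    // C++11 brace initialization of the temporaries has the same effect:
    // a braced expression cannot be a parameter declaration either.
    std::vector<int> values(std::istream_iterator<int>{in},
                            std::istream_iterator<int>{});
    return values;
}
```

Both functions fill the vector through the iterator range constructor, exactly as the two-line version was meant to.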
EDIT
Here's an example using the istreambuf_iterator to read in a file of 64-bit numbers laid out end-to-end:
#include <fstream>
#include <vector>
#include <algorithm>
#include <iterator>
using namespace std;
int main(int argc, char** argv)
{
    // Open in binary mode so the bytes are read back untranslated:
    ifstream input("my-file.txt", ios::binary);
    istreambuf_iterator<char> input_begin(input);
    istreambuf_iterator<char> input_end;
    // Fill a char vector with input file's contents:
    vector<char> char_input(input_begin, input_end);
    input.close();
    // Convert it to an array of unsigned long with a cast
    // (this assumes unsigned long is 64 bits wide on your platform):
    unsigned long* converted = reinterpret_cast<unsigned long*>(&char_input[0]);
    size_t num_long_elements = char_input.size() * sizeof(char) / sizeof(unsigned long);
    // Put that information into a vector:
    vector<unsigned long> long_input(converted, converted + num_long_elements);
    return 0;
}
Now, I personally rather dislike this solution (using reinterpret_cast, exposing char_input's array), but I'm not familiar enough with istreambuf_iterator to comfortably use one templatized over 64-bit characters, which would make this much easier.
Related
I am trying to store binary data that should have the type of a std::complex< float > into a vector, through iterating over each element of the stream buffer. However I keep getting an error saying
no matching function for call to ‘std::istreambuf_iterator<std::complex<float> >::istreambuf_iterator(std::ifstream&)’
std::for_each(std::istreambuf_iterator<std::complex<float> >(i_f1),
I've tried searching for a solution but cannot find anything that would work. I am also trying to follow an example given in How to read entire stream into a std::vector? . Furthermore I'm compiling using g++ and -std=c++11.
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <cmath>
#include <complex>
#include <boost/tuple/tuple.hpp>
#include <algorithm>
#include <iterator>
int main(){
//path to files
std::string data_path= "/$HOME/some_path/";
//file to be opened
std::string f_name1 = "ch1_d2.dat";
std::ifstream i_f1(data_path + f_name1, std::ios::binary);
if (!i_f1){
std::cout << "Error occurred reading file "<<f_name1 <<std::endl; std::cout << "Exiting" << std::endl;
return 0;
}
//Place buffer contents into vector
std::vector<std::complex<float> > data1;
std::for_each(std::istreambuf_iterator<std::complex<float> >(i_f1),
std::istreambuf_iterator<std::complex<float> >(),
[&data1](std::complex<float> vd){
data1.push_back(vd);
});
// Test to see if vector was read in correctly
for (auto i = data1.begin(); i != data1.end(); i++){
std::cout << *i << " ";
}
i_f1.close();
return 0;
}
I am quite lost as to what I'm doing wrong, and am thus wondering why std::istreambuf_iterator() does not accept the stream I am giving it as a parameter.
Also the error message is confusing me as it seems to imply that I am calling the function in a wrong way, or a function that is non-existent.
Thanks
You want to read std::complex from i_f1 (which is a std::ifstream) using operator>> for std::complex, so you need a std::istream_iterator instead of std::istreambuf_iterator [1]:
std::for_each(std::istream_iterator<std::complex<float> >(i_f1),
              std::istream_iterator<std::complex<float> >(),
              [&data1](std::complex<float> vd){
                  data1.push_back(vd);
              });
Your code can actually be simplified to:
std::vector<std::complex<float>> data1{
    std::istream_iterator<std::complex<float>>(i_f1),
    std::istream_iterator<std::complex<float>>()};
[1] std::istreambuf_iterator is used to iterate character by character over, e.g., a std::basic_istream, not to iterate over it using overloads of operator>>.
You're probably using the wrong tool for the job.
You're trying to use a buffer iterator, which iterates over the constituent parts of a stream's buffer. But you're telling your computer that the buffer is one of complex<float>s … it isn't. An ifstream's buffer is of chars. Hence the constructor you're trying to use (one that takes an ifstream with a buffer of complex<float>) does not exist.
You can use an istream_iterator to perform a formatted iteration, i.e. to use the stream's magical powers (in this case, lexically interpreting input as complex<float>s) rather than directly accessing its underlying bytes.
You can read more on the previous question "the difference between istreambuf_iterator and istream_iterator".
The example you linked to does also go some way to explaining this.
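One caveat, since the question mentions binary data: istream_iterator parses text through operator>>. If the file really holds raw pairs of 32-bit floats in native byte order (an assumption, as the question doesn't show the file layout), an unformatted read may be closer to what is needed. A minimal sketch, where read_binary_complex is a hypothetical helper name:

```cpp
#include <complex>
#include <istream>
#include <vector>

// Reads raw (real, imaginary) float pairs until the stream runs out.
std::vector<std::complex<float>> read_binary_complex(std::istream& in) {
    std::vector<std::complex<float>> values;
    std::complex<float> sample;
    // std::complex<float> has the same layout as float[2], so the bytes of
    // one stored pair can be read straight into it.
    while (in.read(reinterpret_cast<char*>(&sample), sizeof sample)) {
        values.push_back(sample);
    }
    return values;
}
```

The stream passed in would be the std::ifstream opened with std::ios::binary, as in the question.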
I have a file that has three ints on three rows. It looks like this:
000
001
010
And I'm trying to read each integer into the vector positions but I don't know if I'm doing it right. Here is my code:
#include <fstream>
#include <iterator>
#include <vector>
int main()
{
std::vector<int> numbers;
std::fstream out("out.txt");
std::copy(std::ostreambuf_iterator<int>(out.rdbuf()),
std::ostreambuf_iterator<int>(), std::back_inserter(numbers));
}
What am I doing wrong here? I'm getting a "no matching function call" error on the line where I do the copy.
You're using the wrong iterator.
You need istreambuf_iterator, not ostreambuf_iterator:
std::copy(std::istreambuf_iterator<int>(out.rdbuf()),
          std::istreambuf_iterator<int>(), std::back_inserter(numbers));
Note that ostreambuf_iterator is an output iterator: it is used to write, not read. You want to read, for which you need an input iterator such as istreambuf_iterator.
But wait! The above code is not going to work either. Why?
Because you're using istreambuf_iterator and passing int to it. The istreambuf_iterator reads unformatted data, one character at a time, from the stream buffer. For the standard file streams, its template argument should be the stream's character type: char or wchar_t.
What you actually need is called istream_iterator which reads formatted data of given type:
std::copy(std::istream_iterator<int>(out), //changed here also!
std::istream_iterator<int>(), std::back_inserter(numbers));
This will work great now.
Note that you could just avoid using std::copy, and use the constructor of std::vector itself as:
std::fstream in("out.txt");
std::vector<int> numbers((std::istream_iterator<int>(in)), //extra parentheses
                         std::istream_iterator<int>());
Note the extra parentheses around the first argument, which are used to avoid the most vexing parse in C++.
If the vector object is already created (and optionally it has some elements in it), then you can still avoid std::copy as:
numbers.insert(numbers.end(),
               std::istream_iterator<int>(in), //no extra parentheses
               std::istream_iterator<int>());
No extra parentheses are needed in this case.
Hope that helps.
Read the book 'C++ How to Program' by Deitel & Deitel, in particular the chapter on vectors. I assure you, all your problems will be solved. You are treating the text file as an output target (a stream named out and an ostreambuf_iterator) when you actually want to read from it. Instead of using this function, I would suggest reading in strings and copying them into your vector using iterators until EOF is encountered in the file. EDIT: This way is more natural and easier to read and understand if you are new to vectors.
I have the code like this:
#include <iostream.h>
#include <fstream.h>
void main()
{
char dir[25], output[10],temp[10];
cout<<"Enter file: ";
cin.getline(dir,25); //like C:\input.txt
ifstream input(dir,ios::in);
input.getline(output,'\eof');
int num = sizeof(output);
ofstream out("D:\\size.txt",ios::out);
out<<num;
}
I want to print the length of the output. But it always returns the number 10 (the given length) even if the input file has only 2 letters ( Like just "ab"). I've also used strlen(output) but nothing changed. How do I only get the used length of array?
I'm using VS C++ 6.0
The sizeof operator on an array gives you the size allocated for the array, which is 10.
You need to use strlen() to find the length actually used inside the array, but you need to make sure the array is null-terminated.
In C++, a better alternative is simply to use std::string instead of the character array. Then you can use std::string::size() to get the size.
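The three measurements can be compared side by side in a small sketch. The function names are invented for this illustration, and the buffer mirrors the 10-character array from the question:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// sizeof reports the storage reserved for the array, regardless of content.
std::size_t buffer_capacity() {
    char output[10] = "ab";
    return sizeof(output);
}

// strlen counts characters up to (not including) the terminating '\0'.
std::size_t used_length() {
    char output[10] = "ab";
    return std::strlen(output);
}

// std::string tracks its own length.
std::size_t string_length() {
    std::string output = "ab";
    return output.size();
}
```

Here buffer_capacity() yields 10, while used_length() and string_length() both yield 2.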
sizeof always prints the defined size of an object based on its type, not anything like the length of a string.
At least by current standards, your code has some pretty serious problems. It looks like it was written for a 1993 compiler running on MS-DOS, or something on that order. With a current compiler, the C++ headers shouldn't have .h on the end, among other things.
#include <iostream>
#include <fstream>
#include <string>
int main() {
    std::string dir, output, temp;
    std::cout<<"Enter file: ";
    std::getline(std::cin, dir); //like C:\input.txt
    std::ifstream input(dir.c_str());
    std::getline(input, output);
    std::ofstream out("D:\\size.txt");
    out<<output.size();
}
The getline that you are using is an unformatted input function so you can retrieve the number of characters extracted with input.gcount().
Note that \e is not a standard escape sequence and the character constant \eof almost certainly doesn't do what you think it does. If you don't want to recognise any delimiter you should use read, not getline, passing the size of your buffer so that you don't overflow it.
Is there any way to get input from a file one number at a time?
For example, I want to store the following integer in a vector of integers, since it is so long that it can't be held by even a long long int.
12345678901234567900
So how can I read this number from a file so that I can:
vector<int> numbers;
numbers.push_back(/*>>number goes here<<*/);
I know that the above code isn't really complete but I hope that it explains what I am trying to do.
Also, I've tried Google, and so far it has proved ineffective because only tutorials for C are coming up, which aren't helping me much.
Thanks in advance,
Dan Chevalier
This could be done in a variety of ways, all of them boiling down to converting each char '0'..'9' to the corresponding integer 0..9. Here's how it can be done with a single function call:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
int main()
{
    std::string s = "12345678901234567900";
    std::vector<int> numbers;
    // Subtracting '0' maps each digit character to its numeric value.
    // (A lambda is used here because std::bind2nd was removed in C++17.)
    std::transform(s.begin(), s.end(), std::back_inserter(numbers),
                   [](char c) { return c - '0'; });
    // output
    std::copy(numbers.begin(), numbers.end(),
              std::ostream_iterator<int>(std::cout, " "));
    std::cout << '\n';
}
When reading from a file, you could read the string and transform(), or even transform() directly from istream iterators, if there is nothing else in that file besides your number:
std::ifstream f("test.txt");
std::vector<int> numbers;
// istream_iterator<char> skips whitespace, so a trailing newline is ignored.
std::transform(std::istream_iterator<char>(f),
               std::istream_iterator<char>(),
               std::back_inserter(numbers),
               [](char c) { return c - '0'; });
Off the top of my head this should fill up a character array which you can then iterate through. I realize it's not exactly what you were after but it's my preferred method.
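void readfile(char *string)
{
    ifstream NumberFile;
    NumberFile.open("./Number"); //For a unix file-system
    // Caution: this extraction does not check the buffer size, so the
    // caller's array must be large enough for every digit plus the
    // terminating '\0'.
    NumberFile >> string;
    NumberFile.close();
}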
Also, to perform operations on the actual numbers you can use:
CharacterArray[ElementNumber] - '0'
and to get the number when it is small enough to fit in a datatype, you add up each element's digit value multiplied by 10 to the power of its position counted from the right (the last digit has position 0).
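The sum of digits times powers of ten can be evaluated left to right without computing any powers, by multiplying the running total by 10 before adding each digit. A minimal sketch, assuming the string holds only the characters '0' to '9' and the result fits in 64 bits (digits_to_number is a made-up name):

```cpp
#include <cstdint>
#include <string>

// Converts a decimal digit string to a number.
// Assumes only '0'..'9' and a value small enough for std::uint64_t.
std::uint64_t digits_to_number(const std::string& digits) {
    std::uint64_t value = 0;
    for (char c : digits) {
        // Shift the digits seen so far one decimal place, then add the new one.
        value = value * 10 + static_cast<std::uint64_t>(c - '0');
    }
    return value;
}
```

For example, "1234" is processed as ((1 * 10 + 2) * 10 + 3) * 10 + 4.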
You can read a char at a time with char c; cin.get(c); and convert it to the numeral with c -= '0'. But perhaps you can just read it as a string or use something like BigNum.
Basically my task is having to sort a bunch of strings of variable length ignoring case. I understand there is a function strcasecmp() that compares cstrings, but doesn't work on strings. Right now I'm using getline() for strings so I can just read in the strings one line at a time. I add these to a vector of strings, then convert to cstrings for each call of strcasecmp(). Instead of having to convert each string to a cstring before comparing with strcasecmp(), I was wondering if there was a way I could use cin.getline() for cstrings without having a predefined char array size. Or, would the best solution be to just read in string, convert to cstring, store in vector, then sort?
I assume by "convert to cstring" you mean using the c_str() member of string. If that is the case, in most implementations that isn't really a conversion, it's just an accessor. The difference is only important if you are worried about performance (which it sounds like you are). Internally std::strings are (pretty much always, but technically do not have to be) represented as a "cstring". The class takes care of managing its size for you, but it's just a dynamically allocated cstring underneath.
So, to directly answer: You have to specify the size of the array when using cin.getline. If you don't want to specify a size, then use getline and std::string. There's nothing wrong with that approach.
C++ is pretty efficient on its own. Unless you have a truly proven need to do otherwise, let it do its thing.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
#include <strings.h> // POSIX header that declares strcasecmp
using namespace std;
// Take the strings by const reference to avoid copying on every comparison.
bool cmp(const string& a, const string& b)
{
    return(strcasecmp(a.c_str(), b.c_str()) < 0);
}
int main(int argc, char *argv[])
{
    vector<string> strArr;
    //too lazy to test with getline(cin, str);
    strArr.push_back("aaaaa");
    strArr.push_back("AAAAA");
    strArr.push_back("ababab");
    strArr.push_back("bababa");
    strArr.push_back("abcabc");
    strArr.push_back("cbacba");
    strArr.push_back("AbCdEf");
    strArr.push_back("aBcDeF");
    strArr.push_back(" whatever");
    sort(strArr.begin(), strArr.end(), cmp);
    copy(strArr.begin(), strArr.end(), ostream_iterator<string>(cout, " \n"));
    return(0);
}
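Since strcasecmp is a POSIX function rather than standard C++, here is a sketch of a portable comparison that works directly on std::string, so no conversion to a cstring is involved. It assumes a C++11 compiler and single-byte characters, and the names less_ignore_case and sort_ignore_case are invented for this example:

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// True if a sorts before b when letter case is ignored.
bool less_ignore_case(const std::string& a, const std::string& b) {
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        // unsigned char keeps std::tolower well-defined for all byte values.
        [](unsigned char x, unsigned char y) {
            return std::tolower(x) < std::tolower(y);
        });
}

// Sorts the lines in place, ignoring case.
void sort_ignore_case(std::vector<std::string>& lines) {
    std::sort(lines.begin(), lines.end(), less_ignore_case);
}
```

The vector can be filled with getline(cin, str) as before and passed to sort_ignore_case.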