I read an answer here showing how to read an entire stream into a std::string with the following one (two) liner:
std::istreambuf_iterator<char> eos;
std::string s(std::istreambuf_iterator<char>(stream), eos);
For doing something similar to read a binary stream into a std::vector, why can't I simply replace char with uint8_t and std::string with std::vector?
auto stream = std::ifstream(path, std::ios::in | std::ios::binary);
auto eos = std::istreambuf_iterator<uint8_t>();
auto buffer = std::vector<uint8_t>(std::istreambuf_iterator<uint8_t>(stream), eos);
The above produces a compiler error (VC2013):
1>d:\non-svn\c++\library\i\file\filereader.cpp(62): error C2440: '<function-style-cast>' : cannot convert from 'std::basic_ifstream<char,std::char_traits<char>>' to 'std::istreambuf_iterator<_Elem,std::char_traits<_Elem>>'
1>          with
1>          [
1>              _Elem=uint8_t
1>          ]
1>          No constructor could take the source type, or constructor overload resolution was ambiguous
There's just a type mismatch. ifstream is just a typedef:
typedef basic_ifstream<char> ifstream;
So if you want to use a different underlying type, you just have to tell it:
std::basic_ifstream<uint8_t> stream(path, std::ios::in | std::ios::binary);
auto eos = std::istreambuf_iterator<uint8_t>();
auto buffer = std::vector<uint8_t>(std::istreambuf_iterator<uint8_t>(stream), eos);
That works for me.
Or, since Dietmar says this might be a little sketchy, you could do something like:
auto stream = std::ifstream(...);
std::vector<uint8_t> data;
std::for_each(std::istreambuf_iterator<char>(stream),
std::istreambuf_iterator<char>(),
[&data](const char c){
data.push_back(c);
});
ifstream is a stream of char, not uint8_t. You'll need either basic_ifstream<uint8_t> or istreambuf_iterator<char> for the types to match.
The former may not work without some amount of work, since the library is only required to support streams of char and wchar_t; so you probably want istreambuf_iterator<char>.
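A minimal sketch of the istreambuf_iterator<char> route, assuming the goal is a std::vector<uint8_t>. The helper names read_all_bytes and read_file_bytes are made up for illustration. The stream stays a plain char stream, and the vector's range constructor converts each char to uint8_t:

```cpp
#include <cstdint>
#include <fstream>
#include <istream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: drains a char-based input stream into a byte vector.
// The vector's range constructor converts each char to std::uint8_t.
std::vector<std::uint8_t> read_all_bytes(std::istream& in)
{
    return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(in),
                                     std::istreambuf_iterator<char>());
}

// Hypothetical wrapper for files. Returns an empty vector if the file
// cannot be opened (an assumption made to keep the sketch short).
std::vector<std::uint8_t> read_file_bytes(const std::string& path)
{
    std::ifstream stream(path, std::ios::in | std::ios::binary);
    if (!stream) {
        return {};
    }
    return read_all_bytes(stream);
}
```

With this, auto buffer = read_file_bytes(path); replaces the three lines from the question, and no basic_ifstream<uint8_t> is involved.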
I am trying to read a .WAV file in C++ into a vector of binary data:
typedef std::istreambuf_iterator<char> file_iterator;
std::ifstream file(path, std::ios::in | std::ios::binary);
if (!file.is_open()) {
throw std::runtime_error("Failed to open " + path);
}
std::vector<std::byte> content((file_iterator(file)), file_iterator());
When I attempt to compile this code I get an error:
Cannot convert 'char' to 'std::byte' in initialization
However if I change the vector to std::vector<unsigned char> it works fine.
Looking at the documentation for std::byte it looks like it is supposed to act like an unsigned char so I'm not sure where the compiler is getting confused.
Is there any particular way you are supposed to go about reading a file into a vector of bytes? (I am looking for a modern C++ approach)
I am using MinGW 7.3.0 as my compiler.
EDIT:
This question is not a duplicate because I am specifically concerned about modern C++ techniques and the use of std::byte which is not discussed in that question.
std::byte is a scoped enum. As such, there are restrictions on conversion to the type that do not exist for fundamental types like char.
Although the underlying type of std::byte is unsigned char, there is no implicit conversion from char (or from any other integer type) to std::byte, so a vector of std::byte cannot be initialized directly from an iterator that yields char.
One solution is to use a vector of unsigned char to store the file content. Since a byte is not an arithmetic type, many of the numeric operations do not exist for byte (only the bitwise ones).
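If you must use std::byte, you can try defining the iterator and fstream using that type, but note that the library is only required to support streams of char and wchar_t, so this may not compile or work on your implementation: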
typedef std::istreambuf_iterator<std::byte> file_iterator;
std::basic_ifstream<std::byte> file(path, std::ios::in | std::ios::binary);
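A more portable alternative, sketched here under the assumption that C++17 is available: keep the ordinary char stream and convert each character explicitly. The function name read_file_as_bytes is made up for illustration, and the error handling mirrors the question's throw:

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (needs C++17 for std::byte).
// Reads the file through a normal char stream and converts each
// character to std::byte with an explicit cast.
std::vector<std::byte> read_file_as_bytes(const std::string& path)
{
    std::ifstream file(path, std::ios::in | std::ios::binary);
    if (!file.is_open()) {
        throw std::runtime_error("Failed to open " + path);
    }
    std::vector<std::byte> content;
    std::transform(std::istreambuf_iterator<char>(file),
                   std::istreambuf_iterator<char>(),
                   std::back_inserter(content),
                   [](char c) {
                       // Go through unsigned char so negative chars map to 128..255.
                       return static_cast<std::byte>(static_cast<unsigned char>(c));
                   });
    return content;
}
```

The explicit static_cast is what the scoped enum requires. The iterator-pair constructor from the question cannot perform it implicitly.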
When I use this code
std::string filename = "tmp.bin";
std::ifstream fileStream;
std::vector<unsigned char> fileBuffer;
fileStream = std::ifstream(filename.c_str(), std::ios::binary | std::ios::ate);
fileBuffer.reserve(fileStream.tellg());
fileStream.seekg(0, std::ios::beg);
fileBuffer.insert(fileBuffer.begin(), std::istream_iterator<BYTE>(fileStream), std::istream_iterator<BYTE>());
all the spaces in my binary file are skipped, so fileBuffer contains no spaces, but I need every byte for Base64 encoding.
What is wrong here?
You need to use std::istreambuf_iterator<char>. istream_iterator uses operator>> to extract data, which for char and unsigned char skips whitespace by default.
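To see the difference without touching a file, here is a small sketch that runs both iterator types over an in-memory std::istringstream. The two function names are made up for illustration:

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Formatted extraction: istream_iterator calls operator>>, which skips
// whitespace by default, so spaces and newlines are dropped.
std::vector<char> extract_with_istream_iterator(const std::string& bytes)
{
    std::istringstream in(bytes);
    return std::vector<char>(std::istream_iterator<char>(in),
                             std::istream_iterator<char>());
}

// Unformatted extraction: istreambuf_iterator reads straight from the
// stream buffer, so every character is kept.
std::vector<char> extract_with_istreambuf_iterator(const std::string& bytes)
{
    std::istringstream in(bytes);
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}
```

For the five-character input "a b\nc", the first function returns three characters and the second returns all five. Clearing the skipws flag on the stream (for example with std::noskipws) is another way to stop operator>> from skipping whitespace, but istreambuf_iterator avoids formatted extraction entirely.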
Side note: filebufs in C++ are defined in terms of the C standard, which has the following to say in a note regarding seeking to the end of binary files:
Setting the file position indicator to end-of-file, as with fseek(file, 0, SEEK_END), has undefined behavior for a binary stream (because of possible trailing null characters) or for any stream with state-dependent encoding that does not assuredly end in the initial shift state.
It'll probably work fine regardless, but unless the reallocations are a serious issue you should just read the file in one shot:
std::ifstream fileStream("tmp.bin", std::ios::binary);
std::vector<char> fileBuffer{
std::istreambuf_iterator<char>(fileStream),
std::istreambuf_iterator<char>()
};
Older C++ will need to avoid a vexing parse with
std::vector<char> fileBuffer(
(std::istreambuf_iterator<char>(fileStream)),
std::istreambuf_iterator<char>()
);
If your library has char_traits for unsigned char, you could also use std::basic_ifstream<unsigned char>, although this isn't portable. You can always convert to unsigned char later anyway, depending on what you need.
I have:
typedef unsigned char byte;
std::vector<byte> data;
I tried to save the data to a file this way, but I get an error:
fstream file(filename,ios::out);
file.write(&data, data.size());
How do I process or cast the data to write it to a file?
To store a vector in a file, you have to write the contents of the vector, not the vector itself. You can access the raw data with &vector[0], address of the first element (given it contains at least one element).
ofstream outfile(filename, ios::out | ios::binary);
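outfile.write(reinterpret_cast<const char*>(&data[0]), data.size());
This should be fairly efficient at writing. The cast is needed because write() expects a const char* while the vector holds unsigned char. fstream is generic; use ofstream if you are only going to write.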
* The statement file.write(&buffer[0],buffer.size()) produces an error:
error C2664: 'std::basic_ostream<_Elem,_Traits>::write' : cannot
convert parameter 1 from 'unsigned char *' to 'const char *'
* In my compiler (VS2008), vector has no data() method.
I think below is correct:
file.write((const char*)&buffer[0],buffer.size());
Use vector::data to get a pointer to the underlying data:
file.write(reinterpret_cast<const char*>(data.data()), data.size());
You need to pass the address of the first element, not the address of the vector object itself.
&data[0]
Note: Make sure that the vector is not empty before doing this.
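Putting these points together, a minimal sketch might look like the following. The function name write_bytes and its bool return value are assumptions made for illustration. It uses &data[0] so that it also works on older compilers without vector::data, and it guards against the empty case:

```cpp
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: writes the contents of `data` to `filename` in
// binary mode. Returns true if the stream is still good afterwards.
bool write_bytes(const std::string& filename,
                 const std::vector<unsigned char>& data)
{
    std::ofstream file(filename.c_str(), std::ios::out | std::ios::binary);
    if (!file) {
        return false;
    }
    if (!data.empty()) {
        // &data[0] is only valid for a non-empty vector.
        // The cast is required because write() takes a const char*.
        file.write(reinterpret_cast<const char*>(&data[0]),
                   static_cast<std::streamsize>(data.size()));
    }
    return file.good();
}
```

An empty vector simply produces an empty file here, which avoids dereferencing a non-existent first element.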
A lot of these solutions are only partially complete (lacking includes & casts), so let me post a full working example:
#include <cstddef>  // std::byte (C++17)
#include <fstream>
#include <string>
#include <vector>
int main()
{
    std::vector<std::byte> dataVector(10, std::byte{ 'Z' });
    const std::string filename = "C:\\test_file.txt";
    std::ofstream outfile(filename, std::ios::out | std::ios::binary);
    outfile.write(reinterpret_cast<const char*>(dataVector.data()), dataVector.size());
    return 0;
}
I think that ostream.write(myVector[0], ...) will not work, just as the equivalent call did not work for me when reading into a vector.
What works is ostream.write(myVector.data(), ...)
Note: for reading, use ifstream.read(const_cast<char*>(myVector.data()), ...)
I've been trying to compress strings and save them to text files, then read the data and decompress it. When I try to decompress the read string, however, I get a Z_BUF_ERROR (-5) and the string may or may not decompress.
In the console, I can compress/decompress all day:
std::string s = zlib_compress("HELLO asdfasdf asdf asdfasd f asd f asd f awefo#8 892y*(#Y");
std::string e = zlib_decompress(s);
The string e will return the original string with no difficulty.
However, when I do this:
zlib_decompress(readFile(filename));
I get a Z_BUF_ERROR. I think it might be due in part to hidden characters in files, but I'm not really sure.
Here's my readFile function:
std::string readFile(std::string filename)
{
std::ifstream file;
file.open(filename.c_str(), std::ios::binary);
file.seekg (0, std::ios::end);
int length = file.tellg();
file.seekg (0, std::ios::beg);
char * buffer = new char[length];
file.read(buffer, length);
file.close();
std::string data(buffer);
return data;
}
When I write the compressed data, I use:
void writeFile(std::string filename, std::string data)
{
std::ofstream file;
file.open(filename.c_str(), std::ios::binary);
file << data;
file.close();
}
If needed, I'll show the functions I use to de/compress, but if it works without the File IO, I feel that the problem is an IO problem.
First, you're dealing with binary data that might or might not have embedded null characters. std::string isn't really the correct container for that, although you can handle embedded null characters if you do it correctly. However, using a std::string to store something documents a certain expectation and you're breaking that convention.
Second, the line std::string data(buffer); isn't doing what you think it does: that is the constructor you're supposed to use to construct a string from a null-terminated C string. You're dealing with binary data here, so there is a chance that you either don't get the full buffer into the string because it encounters a null terminator in the middle of the buffer, or it runs off the end of the buffer until it finds a null (or causes a seg fault). If you absolutely, positively must use a std::string, use the "correct" constructor, which would be std::string data(buffer, length);.
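The difference between the two constructors can be shown with a small sketch. The function names from_c_string and from_buffer are made up for illustration:

```cpp
#include <cstddef>
#include <string>

// Treats `buffer` as a null-terminated C string: copying stops at the
// first '\0', so the buffer must contain one.
std::string from_c_string(const char* buffer)
{
    return std::string(buffer);
}

// Copies exactly `length` bytes, embedded '\0' characters included.
std::string from_buffer(const char* buffer, std::size_t length)
{
    return std::string(buffer, length);
}
```

For a buffer holding the bytes 'x', '\0', 'y', the first function yields a string of size 1, while from_buffer(buffer, 3) yields all three bytes. Compressed data can contain a zero byte anywhere, which would explain why decompression only fails some of the time.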
All that said, you are using the wrong data structure: what you want is a dynamic array of char/unsigned char. That would be a std::vector, not a std::string. As an aside, you should also pass the parameters to readFile and writeFile by const reference. The code that you wrote makes copies of the strings, and if the buffer you pass into writeFile() is large, that leads to an unpleasant and completely unnecessary hit in memory consumption and performance.
As the file might contain '\0' characters, you should specify the size when you assign the content to the std::string.
std::string data(buffer, length);
For what it's worth, here's how you could alter readFile() and writeFile():
std::vector<char> readFile(const std::string& filename)
{
std::ifstream file;
file.open(filename.c_str(), std::ios::binary);
file.seekg (0, std::ios::end);
const int length = file.tellg();
file.seekg (0, std::ios::beg);
std::vector<char> data(length);
file.read(&data[0], length);
file.close();
return data;
}
void writeFile(const std::string& filename, const std::vector<char>& data)
{
std::ofstream file;
file.open(filename.c_str(), std::ios::binary);
file.write(&data[0], data.size());
file.close();
}
Then you would also change your compress() and decompress() functions to work with std::vector<char>. Also note that so far the code is lacking any error handling. For example, what happens if the file doesn't exist? After calling file.open() you can check for any error by doing if (!file) { /* error handling */ }.
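As a rough sketch of what that error handling could look like: the name readFileChecked and the choice to throw std::runtime_error are assumptions for illustration, not part of the code above. It also avoids taking &data[0] on an empty vector:

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical variant of readFile() that reports failures by throwing.
std::vector<char> readFileChecked(const std::string& filename)
{
    std::ifstream file(filename.c_str(), std::ios::binary);
    if (!file) {
        throw std::runtime_error("Failed to open " + filename);
    }
    file.seekg(0, std::ios::end);
    const std::streamoff length = file.tellg();  // -1 on failure
    if (length < 0) {
        throw std::runtime_error("Failed to get the size of " + filename);
    }
    file.seekg(0, std::ios::beg);
    std::vector<char> data(static_cast<std::size_t>(length));
    if (!data.empty()) {
        // &data[0] is only valid when the vector is non-empty.
        file.read(&data[0], static_cast<std::streamsize>(length));
        if (!file) {
            throw std::runtime_error("Failed to read " + filename);
        }
    }
    return data;
}
```

Whether to throw, return an error code, or return an empty vector is a design choice. Throwing just keeps this sketch short.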
Apparently boost::asio::async_read doesn't like strings, as the only overload of boost::asio::buffer allows me to create const_buffers, so I'm stuck with reading everything into a streambuf.
Now I want to copy the contents of the streambuf into a string, but it apparently only supports writing to char* (sgetn()), creating an istream with the streambuf and using getline().
Is there any other way to create a string with the streambuf's contents without excessive copying?
I don't know whether it counts as "excessive copying", but you can use a stringstream:
std::ostringstream ss;
ss << someStreamBuf;
std::string s = ss.str();
Like, to read everything from stdin into a string, do
std::ostringstream ss;
ss << std::cin.rdbuf();
std::string s = ss.str();
Alternatively, you may also use an istreambuf_iterator. You will have to measure whether this or the above way is faster; I don't know.
std::string s((istreambuf_iterator<char>(someStreamBuf)),
istreambuf_iterator<char>());
Note that someStreamBuf above is meant to represent a streambuf*, so take its address as appropriate. Also note the additional parentheses around the first argument in the last example, so that it doesn't interpret it as a function declaration returning a string and taking an iterator and another function pointer ("most vexing parse").
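Both variants can be tried on a plain std::stringbuf before wiring them up to asio. In this sketch the function names are made up for illustration, and each function takes a std::streambuf* as described above:

```cpp
#include <iterator>
#include <sstream>
#include <streambuf>
#include <string>

// Variant 1: operator<< on a streambuf* copies everything that can be
// read from the buffer into the output stream.
std::string drain_with_ostringstream(std::streambuf* buf)
{
    std::ostringstream ss;
    ss << buf;
    return ss.str();
}

// Variant 2: construct the string from a pair of istreambuf_iterators.
// Inside a return expression there is no vexing parse to work around.
std::string drain_with_iterators(std::streambuf* buf)
{
    return std::string(std::istreambuf_iterator<char>(buf),
                       std::istreambuf_iterator<char>());
}
```

Both functions consume the readable contents of the buffer, so a second call on the same streambuf returns an empty string.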
It's really buried in the docs...
Given boost::asio::streambuf b, with size_t buf_size ...
boost::asio::streambuf::const_buffers_type bufs = b.data();
std::string str(boost::asio::buffers_begin(bufs),
boost::asio::buffers_begin(bufs) + buf_size);
Another possibility with boost::asio::streambuf is to use boost::asio::buffer_cast<const char*>() in conjunction with boost::asio::streambuf::data() and boost::asio::streambuf::consume() like this:
const char* header=boost::asio::buffer_cast<const char*>(readbuffer.data());
//Do stuff with header, maybe construct a std::string with std::string(header,header+length)
readbuffer.consume(length);
This won't work with normal streambufs and might be considered dirty, but it seems to be the fastest way of doing it.
For boost::asio::streambuf you may find a solution like this:
boost::asio::streambuf buf;
/*put data into buf*/
std::istream is(&buf);
std::string line;
std::getline(is, line);
Print out the string:
std::cout << line << std::endl;
You can find more here: http://www.boost.org/doc/libs/1_49_0/doc/html/boost_asio/reference/async_read_until/overload3.html
One can also obtain the characters from asio::streambuf using std::basic_streambuf::sgetn:
asio::streambuf in;
// ...
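std::string str(in.size(), '\0'); // a variable-length char array is not standard C++
const std::streamsize rc = in.sgetn(&str[0], static_cast<std::streamsize>(str.size()));
str.resize(static_cast<std::size_t>(rc));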
The reason you can only create const_buffer from std::string is because std::string explicitly doesn't support direct pointer-based writing in its contract. You could do something evil like resize your string to a certain size, then const_cast the constness from c_str() and treat it like a raw char* buffer, but that's very naughty and will get you in trouble someday.
I use std::vector for my buffers because as long as the vector doesn't resize (or you are careful to deal with resizing), you can do direct pointer writing just fine. If I need some of the data as a std::string, I have to copy it out, but the way I deal with my read buffers, anything that needs to last beyond the read callback needs to be copied out regardless.
I didn't see an existing answer for reading exactly n chars into a std::stringstream, so here is how that can be done:
std::stringstream ss;
boost::asio::streambuf sb;
const auto len = 10;
std::copy_n(boost::asio::buffers_begin(sb.data()), len,
std::ostream_iterator<decltype(ss)::char_type>(ss));
A simpler answer would be to convert it to a std::string and manipulate that, somewhat like this:
std::string buffer_to_string(const boost::asio::streambuf &buffer)
{
using boost::asio::buffers_begin;
auto bufs = buffer.data();
std::string result(buffers_begin(bufs), buffers_begin(bufs) + buffer.size());
return result;
}
This gives very concise code for the task.
I mostly don't like answers that say "You don't want X, you want Y instead and here's how to do Y" but in this instance I'm pretty sure I know what tstenner wanted.
In Boost 1.66, the dynamic string buffer type was added so async_read can directly resize and write to a string buffer.
I tested the first answer and got a compiler error when compiling with "g++ -std=c++11".
What worked for me was:
#include <string>
#include <boost/asio.hpp>
#include <sstream>
//other code ...
boost::asio::streambuf response;
//more code
std::ostringstream sline;
sline << &response; // need '&' or you get a compiler error
std::string line = sline.str();
This compiled and ran.
I think it's more like:
streambuf.commit( number_of_bytes_read );
istream istr( &streambuf );
string s;
istr >> s;
I haven't looked into the basic_streambuf code, but I believe that should be just one copy into the string.