How do I implement seekg() for a custom istream/streambuf? - c++

I used to be a C++ expert a decade ago, but for the past 10 years I've been programming Java. I just started a C++ project that uses a small third-party XML parser. The XML parser accepts an STL istream. My XML data is coming from a Windows COM IStream. I thought I'd do the Right Thing and create an adapter to take the IStream data and present it to the XML parser through an istream.
I followed the excellent tutorial at http://www.mr-edd.co.uk/blog/beginners_guide_streambuf and created a COMStreambuf that takes data from the underlying COM IStream, and used it as a buffer for a custom COMIstream. Everything looks good, but I get a read error from the parser.
Turns out the parser reads the whole file into memory by using seekg() on the istream to find out its size and then goes back to the beginning with seekg() to read it in one go. Unsurprisingly, the aforementioned tutorial decided to "save [instructions on implementing seeking] for another post" which was apparently never written.
Can someone tell me what I need to do to implement seekg() with my custom istream/streambuf? I would venture out doing it myself (my first inclination would be to override stuff in istream), but with my inexperience this deep in the STL and with my Java mentality I fear I would do something incomplete and have a brittle solution. (Without reading tutorials, for example, I never would have guessed that one makes a custom istream by writing a new streambuf, for example, or that I would need to call imbue() with a default locale, etc.)
Thanks for any help. I've been very impressed with this site---both with the knowledge of the participants and their friendly, honest nature in conceding who has the best answer. :)

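I assume that by "seekg" you mean seekoff and seekpos: istream::seekg() and tellg() do no seeking themselves but forward to the stream buffer's pubseekoff() and pubseekpos(), which call the protected virtual members seekoff and seekpos. Those are the two functions your COMStreambuf needs to override; you do not have to change anything in the istream class.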
The straightforward way to implement members seekoff and seekpos of your COMStreambuf is to wrap the Seek method of the IStream interface. For example, something like this should work:
// COMStreambuf.cpp
COMStreambuf::pos_type COMStreambuf::seekoff(COMStreambuf::off_type off_, std::ios_base::seekdir way_, std::ios_base::openmode which_)
{
    LARGE_INTEGER liMove;
    liMove.QuadPart = off_;
    // Map the iostream seek direction onto the COM seek origin.
    DWORD dwOrigin = STREAM_SEEK_SET;
    if (way_ == std::ios_base::cur) {
        dwOrigin = STREAM_SEEK_CUR;
    } else if (way_ == std::ios_base::end) {
        dwOrigin = STREAM_SEEK_END;
    } else {
        assert(way_ == std::ios_base::beg);
    }
    ULARGE_INTEGER uliNewPosition;
    if (which_ & std::ios_base::in) {
        // Seeking the input and output sequences together is not supported.
        if (which_ & std::ios_base::out)
            return pos_type(off_type(-1));
        // Bytes already buffered but not yet consumed put streamIn ahead of the
        // logical read position, so compensate for them on relative seeks.
        if (way_ == std::ios_base::cur)
            liMove.QuadPart -= egptr() - gptr();
        HRESULT hres = streamIn->Seek(liMove, dwOrigin, &uliNewPosition);
        if (FAILED(hres))
            return pos_type(off_type(-1));
        // Mark the get area as consumed so the next read refills the buffer.
        setg(eback(), egptr(), egptr());
    } else if (which_ & std::ios_base::out) {
        HRESULT hres = streamOut->Seek(liMove, dwOrigin, &uliNewPosition);
        if (FAILED(hres))
            return pos_type(off_type(-1));
        // Standard setp() takes two arguments; advance pptr() to epptr() so the
        // next write calls overflow().
        setp(pbase(), epptr());
        pbump(static_cast<int>(epptr() - pbase()));
    } else {
        return pos_type(off_type(-1));
    }
    return pos_type(uliNewPosition.QuadPart);
}
COMStreambuf::pos_type COMStreambuf::seekpos(COMStreambuf::pos_type sp_, std::ios_base::openmode which_)
{
    return seekoff(off_type(sp_), std::ios_base::beg, which_);
}
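In this listing, after setting the position of streamIn I call:
setg(eback(), egptr(), egptr());
This marks the get area as fully consumed, so the next read calls underflow() and refills the buffer from the new position. It leaves eback() where it was, though, so after a seek sputbackc or sungetc will operate on old data. You may want to consider whether this makes sense for your application and do something different.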
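Since IStream is only available on Windows, here is a minimal portable sketch of the same idea over an in-memory string. MemoryStreambuf and slurp are hypothetical names, not part of the code above; slurp imitates what the parser in the question is described as doing (seek to the end for the size, rewind, read everything).

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>
#include <utility>

// Hypothetical read-only stream buffer over a string (a stand-in for the COM stream).
class MemoryStreambuf : public std::streambuf {
public:
    explicit MemoryStreambuf(std::string data) : data_(std::move(data)) {
        char* base = &data_[0];
        setg(base, base, base + data_.size());  // the whole string is the get area
    }

protected:
    pos_type seekoff(off_type off, std::ios_base::seekdir way,
                     std::ios_base::openmode which) override {
        // Input only: reject requests that involve the output sequence.
        if (!(which & std::ios_base::in) || (which & std::ios_base::out))
            return pos_type(off_type(-1));
        off_type base = 0;
        if (way == std::ios_base::cur) base = gptr() - eback();
        else if (way == std::ios_base::end) base = egptr() - eback();
        else assert(way == std::ios_base::beg);
        const off_type target = base + off;
        if (target < 0 || target > egptr() - eback())
            return pos_type(off_type(-1));
        setg(eback(), eback() + target, egptr());
        return pos_type(target);
    }

    pos_type seekpos(pos_type sp, std::ios_base::openmode which) override {
        return seekoff(off_type(sp), std::ios_base::beg, which);
    }

private:
    std::string data_;
};

// Imitates the parser: find the size with seekg/tellg, rewind, read in one go.
std::string slurp(std::istream& in) {
    in.seekg(0, std::ios_base::end);
    const std::streamoff size = in.tellg();
    in.seekg(0, std::ios_base::beg);
    if (size <= 0) return std::string();
    std::string out(static_cast<std::size_t>(size), '\0');
    in.read(&out[0], size);
    out.resize(static_cast<std::size_t>(in.gcount()));
    return out;
}
```

Because the whole source is already in memory here, seekoff only has to move gptr(). A buffer backed by a real device, like the COMStreambuf above, has to reposition the device and then invalidate its get area instead.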

Related

How to get stream object identified by a FILE?

The documentation states that FILE is an object type that identifies a stream. So, is it possible to get the stream object associated with a FILE?
For example, I'd like to get std::cout object from stdout FILE pointer, or std::cerr from stderr etc. More generally I want to write a function that redirects a given stream and sets the custom streambuf to it, something like this:
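void redirect(FILE* file, std::ios& stream) {
    // std::ios cannot be copied, so the stream is taken by reference.
    FILE* reopened = nullptr;
    freopen_s(&reopened, "CONOUT$", "w", file);
    stream.rdbuf(customBuffer);
}
It would be used to redirect streams like this: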
redirect(stdout, std::cout);
redirect(stderr, std::cerr);
It seems redundant to have 2 parameters, since both parameters are always associated with each other.
The C++ standard library includes the C standard library. A FILE is a C stream, which is quite a different animal from a C++ iostream. It is possible for a C++ stream implementation to rely on an underlying FILE, but this is not required by the standard, and even in that case there is no way to retrieve it.
What is possible is to build a custom std::streambuf that explicitly uses an underlying FILE *, and use it in a C++ stream. std::basic_streambuf is one of the few classes from the C++ standard library that is explicitly designed as a base class for custom derivation. Unfortunately I could not find a tutorial for it, but the class contains a number of virtual methods that you just have to override. It is not exactly an easy path, but it is possible with some work, heavy testing, and eventually some help from SO if you get stuck somewhere. But a full implementation is far beyond a SO answer.
TL/DR: there is no underlying C++ stream associated with a FILE, but with some work you can build a custom std::streambuf that will use an underlying FILE *. Those are rather advanced operations, though...
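To give an idea of the shape of such a class, here is a minimal input-only sketch. FileStreambuf and read_all are hypothetical names, and the sketch leaves out most of what a full implementation would need (output, seeking, putback, locale handling); it only overrides underflow().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <istream>
#include <iterator>
#include <streambuf>
#include <string>

// Hypothetical input-only stream buffer that reads through a C FILE*.
// It does not own the FILE*; the caller remains responsible for closing it.
class FileStreambuf : public std::streambuf {
public:
    explicit FileStreambuf(std::FILE* file) : file_(file) {
        assert(file_ != nullptr);
        // Start with an empty get area so the first read calls underflow().
        char* end = buffer_ + sizeof buffer_;
        setg(buffer_, end, end);
    }

protected:
    int_type underflow() override {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        // Refill the internal buffer from the FILE*.
        const std::size_t n = std::fread(buffer_, 1, sizeof buffer_, file_);
        if (n == 0)
            return traits_type::eof();
        setg(buffer_, buffer_, buffer_ + n);
        return traits_type::to_int_type(*gptr());
    }

private:
    std::FILE* file_;
    char buffer_[256];
};

// Reads everything that is left in the FILE* through a std::istream.
std::string read_all(std::FILE* file) {
    FileStreambuf buf(file);
    std::istream in(&buf);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Note that the buffer reads ahead, so after using it the FILE's own position is generally further along than what the istream has handed out.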
While it is not possible to do this cleanly in C++, you could read everything from the FILE into a string and wrap that in an istringstream, something like this:
FILE* f = popen("someFile", "r"); // popen runs a command; use fopen/fclose for a plain file
const unsigned BUFF = 2048;
string total;
vector<char> cBuf(BUFF);
bool done = false;
while (!done) {
    size_t read = fread(&cBuf[0], 1, BUFF, f);
    // Append only the bytes actually read, not the whole buffer.
    total.append(cBuf.begin(), cBuf.begin() + read);
    if (read < BUFF)
    {
        done = true;
    }
}
pclose(f);
istringstream filey(total);
Hope this helps.

C++: several istream pointers and references, and refactoring

The function below works, but it seems to me that it has a very bad smell.
My project communicates with a device over HTTP. Some URLs require digest authentication and some do not.
Some responses are compressed with deflate and some are not.
So my function has three different ways of obtaining an istream,
and I need to read from that istream in one place at the bottom of the function.
But as good people said in another question of mine (C++ variable visible scopes and streams), pointers are a bad choice in this case.
In some cases this code also creates a dynamic object:
Poco::InflatingInputStream* inflater = new Poco::InflatingInputStream(*respStreamPtr);
Is this a path to memory leaks?
If I create inflater without new, then *respStreamPtr has no data outside the scope of the if block.
So please advise me how to refactor this code the right way.
std::ostream& requestStream = session->sendRequest(request);
istream* respStreamPtr;
respStreamPtr = &session->receiveResponse(response);
if (response.getStatus() == HTTPResponse::HTTP_UNAUTHORIZED)
{
credentials->authenticate(request, response);
session->sendRequest(request);
respStreamPtr = &session->receiveResponse(response);
}
if (response.has("Content-Encoding") && response.get("Content-Encoding") == "deflate") {
Poco::InflatingInputStream* inflater = new Poco::InflatingInputStream(*respStreamPtr);
respStreamPtr = &std::istream(inflater->rdbuf());
}
std::ostringstream stringStream;
stringStream << respStreamPtr->rdbuf();
responseBody = stringStream.str();
Yes, that will be a memory leak: every time you use new it creates a new object, which you never delete.
However, you use the inflater variable outside of the scope it's declared in, so you currently have no way of nicely deleting it without modifying other code. For a simple fix, you could just declare Poco::InflatingInputStream* inflater = nullptr; at the top of your code, then delete it at the end.
I'd strongly recommend reading up on how to manage memory correctly in C++, and even taking a look at smart pointers (but not without learning the fundamentals first).
Poco actually has functionality which does a lot of what you're trying to do, so your example could be easily condensed (disclaimer: untested):
session->sendRequest(request);
// An istream reference cannot be reseated, so keep a pointer to the current response stream.
std::istream* responseStream = &session->receiveResponse(response);
if (response.getStatus() == HTTPResponse::HTTP_UNAUTHORIZED)
{
    credentials->authenticate(request, response);
    session->sendRequest(request);
    responseStream = &session->receiveResponse(response);
}
// responseBody is assumed to be a std::string, as in the question.
if (response.has("Content-Encoding") && response.get("Content-Encoding") == "deflate")
{
    Poco::InflatingInputStream inflater(*responseStream);
    Poco::StreamCopier::copyToString(inflater, responseBody);
}
else
{
    Poco::StreamCopier::copyToString(*responseStream, responseBody);
}
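The smart-pointer suggestion above can be sketched without Poco. In this minimal example, SkippingStream is a hypothetical stand-in for Poco::InflatingInputStream (both derive from std::istream and wrap another stream), and read_body is a made-up helper. The point is only that a std::unique_ptr declared in the outer scope keeps the optional wrapper alive until the single read at the bottom, and deletes it automatically.

```cpp
#include <cassert>
#include <istream>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical wrapper stream standing in for Poco::InflatingInputStream:
// it reads from the same buffer as the source but skips a fixed-size prefix.
class SkippingStream : public std::istream {
public:
    SkippingStream(std::istream& source, std::streamsize skip)
        : std::istream(source.rdbuf()) {
        ignore(skip);
    }
};

std::string read_body(std::istream& raw, bool wrapped) {
    std::istream* in = &raw;                 // non-owning: which stream to read from
    std::unique_ptr<std::istream> wrapper;   // owning: outlives the if block below
    if (wrapped) {
        wrapper.reset(new SkippingStream(raw, 4));
        in = wrapper.get();
    }
    // One read, in one place, whichever stream was selected.
    std::ostringstream body;
    body << in->rdbuf();
    return body.str();
}   // wrapper (if any) is destroyed here, so nothing leaks
```

A raw pointer is still used to select the stream, but it never owns anything, so there is nothing to delete by hand.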

Can ICU perform collation comparisons on UTF-16LE data on big endian machines directly?

I have the following code:
UCharIterator iter1;
UCharIterator iter2;
UErrorCode status = U_ZERO_ERROR;
if (ENC_UTF16_BE == m_encoding)
{
uiter_setUTF16BE(&iter1, reinterpret_cast<const char*>(in_string1), in_length1);
uiter_setUTF16BE(&iter2, reinterpret_cast<const char*>(in_string2), in_length2);
return ucol_strcollIter(m_collator, &iter1, &iter2, &status);
}
else if (ENC_UTF8 == m_encoding)
{
uiter_setUTF8(&iter1, reinterpret_cast<const char*>(in_string1), in_length1);
uiter_setUTF8(&iter2, reinterpret_cast<const char*>(in_string2), in_length2);
return ucol_strcollIter(m_collator, &iter1, &iter2, &status);
}
else
{
UnicodeString s1(reinterpret_cast<const char*>(in_string1), in_length1);
UnicodeString s2(reinterpret_cast<const char*>(in_string2), in_length2);
return ucol_strcoll(m_collator, s1.getBuffer(), s1.length(), s2.getBuffer(), s2.length());
}
Now, this follows the 'happy path' where the encoding of the data matches ICU's internal encoding, which, on little-endian systems, is UTF-16LE.
But if this were compiled on a big-endian system and the encoding were UTF-16LE, we would be forced into the 'general' case, which involves creating a UnicodeString object, along with the implied conversion.
It seems like there should be a uiter_setUTF16LE function for this case, but there isn't one. Is this an artifact of ICU always having been UTF-16LE internally in the distant past? Is there another way of doing this, or am I forced to copy/convert?
It looks like I could implement my own 'subclass' of UCharIterator to do this. It seems unfortunate that I would need to do so for something that seems like a relatively common case.

check if something was serialized in std::ostream

Is there an easy way to check whether something was serialized to a std::ostream? I am looking for something like:
some preparation
// ... a very complex code that may result in adding or not to the stream,
// that I will prefer not to change
check if the stream has something added
Note that this will need to work recursively. Is using register_callback a good idea, or is there an easier way?
First the immediate question: register_callback() is intended to deal with appropriate copying and releasing of resources stored in pword() and will have operations only related to that (i.e., copying, assigning, and releasing plus observing std::locale changes). So, no, that won't help you at all.
What you can do, however, is to create a filtering stream buffer which observes if there was a write to the stream, e.g., something like this:
class changedbuf: public std::streambuf {
    std::streambuf* d_sbuf;
    bool d_changed;
    int_type overflow(int_type c) {
        if (traits_type::eq_int_type(c, traits_type::eof())) {
            return traits_type::not_eof(c); // nothing to write
        }
        this->d_changed = true;
        return this->d_sbuf->sputc(traits_type::to_char_type(c));
    }
public:
    changedbuf(std::streambuf* sbuf): d_sbuf(sbuf), d_changed() {}
    bool changed() const { return this->d_changed; }
};
You can use this in place of the std::ostream you already have, e.g.:
void f(std::ostream& out) {
    changedbuf sbuf(out.rdbuf());
    std::ostream changedout(&sbuf);
    // use changedout instead of out; if you need to use a global object, you'd
    // replace/restore the used stream buffer using the version of rdbuf() taking
    // an argument
    if (sbuf.changed()) {
        std::cout << "there was a change\n";
    }
}
A real implementation would actually provide a buffer and deal with proper flushing (i.e., override sync()) and sequence output (i.e., override xsputn()). However, the above version is sufficient as a proof-of-concept.
Others are likely to suggest the use of std::ostringstream. Depending on the amount of data written, this can easily become a performance hog, especially compared to an advanced version of changedbuf which appropriately deals with buffering.
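As a rough sketch of the next step up from the proof of concept, the version below also overrides xsputn() and sync(). It still forwards everything straight to the wrapped buffer instead of keeping a buffer of its own, so it is not the "advanced version" mentioned above. ChangeTrackingBuf and writes_anything are hypothetical names.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <streambuf>

// Hypothetical filtering stream buffer: forwards all output to `target`
// and remembers whether any character was written.
class ChangeTrackingBuf : public std::streambuf {
public:
    explicit ChangeTrackingBuf(std::streambuf* target) : target_(target) {
        assert(target_ != nullptr);
    }
    bool changed() const { return changed_; }

protected:
    // Single-character path.
    int_type overflow(int_type c) override {
        if (traits_type::eq_int_type(c, traits_type::eof()))
            return traits_type::not_eof(c);
        changed_ = true;
        return target_->sputc(traits_type::to_char_type(c));
    }
    // Bulk path: forward whole sequences instead of going char by char.
    std::streamsize xsputn(const char* s, std::streamsize n) override {
        if (n > 0) changed_ = true;
        return target_->sputn(s, n);
    }
    // Propagate flushes to the wrapped buffer.
    int sync() override { return target_->pubsync(); }

private:
    std::streambuf* target_;
    bool changed_ = false;
};

// Runs `body` against a tracked stream that writes into `out`
// and reports whether it wrote anything.
bool writes_anything(void (*body)(std::ostream&), std::ostream& out) {
    ChangeTrackingBuf tracker(out.rdbuf());
    std::ostream tracked(&tracker);
    body(tracked);
    tracked.flush();
    return tracker.changed();
}
```

Because each tracker only wraps another stream buffer, trackers can be nested, which covers the recursive case from the question: an inner tracker reports on its own region while its writes still pass through the outer one.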
Are you passing the stream into the complex code, or is it globally visible? Can it be any kind of ostream or can you constrain the type to ofstream or ostringstream?
You may be able to use tellp to determine whether the file position has changed since your preparation code, if your ostream type supports it (such as with most fstreams). Or, if you're passing the stream in, you could pass an empty ostringstream in and check that it's not empty when the string is extracted to be printed out.
It's not entirely obvious which solution, if any, would be appropriate for you without knowing more about the context of your code and the specifics of your problem. The best answer may be to return (or set as a by-reference out-parameter) a flag indicating whether the stream was inserted into.
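The tellp idea can be sketched as follows. position_moved is a hypothetical helper, and it only makes sense for streams whose buffers support positioning (string streams and most file streams); for others, such as std::cout attached to a terminal, tellp() typically reports failure and this approach does not apply.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

// Returns true if the put position of `out` differs from `before`.
// Assumes both positions are valid, i.e. the stream supports tellp().
bool position_moved(std::ostream& out, std::ostream::pos_type before) {
    const std::ostream::pos_type failed(std::ostream::off_type(-1));
    const std::ostream::pos_type after = out.tellp();
    assert(before != failed && after != failed);
    return after != before;
}
```

The caller records out.tellp() as the preparation step, runs the complex code, and then calls position_moved(out, mark). Code that writes and then seeks back to the mark would go unnoticed, which is one reason a filtering stream buffer can be more robust.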

How to get the length of IStream? C++

I'm creating an IStream as follows:
IStream* stream;
result = CreateStreamOnHGlobal(0, TRUE, &stream);
Then I have a CImage object that I save to this stream:
image->Save(stream, Gdiplus::ImageFormatBMP);
I need to get the number of bytes written to this IStream.
How can I do this?
There is no Length method or anything like it in IStream...
Thanks!
IStream::Stat should do what you want.
Or you can use:
ULARGE_INTEGER liSize;
IStream_Size(pStream, &liSize);
Other functions you might find useful in this context:
IStream_Reset(pStream); // reset seek position to beginning
IStream_Read(pStream, mem, size);
Both IStream_Size as well as IStream::Stat can be used to request the size. IStream_Size appears to be a convenience wrapper around IStream::Stat (that's oddly only available as a C COM macro). If that is indeed the case then there's a lot of data queried: An entire STATSTG, optionally without the pwcsName member.
In that case, a less costly way to get the same information would be IStream::Seek:
HRESULT get_size(IStream* stream, ULARGE_INTEGER& size) {
    // Seeking 0 bytes from the end reports the end position, i.e. the size.
    return stream->Seek({}, STREAM_SEEK_END, &size);
}
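Note that this moves the stream's current read or write pointer to the end. If you need to restore it afterwards, save the current position first and seek back to it with STREAM_SEEK_SET once you have the size. The current position can be saved like this: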
ULARGE_INTEGER current{};
stream->Seek({}, STREAM_SEEK_CUR, &current);