unbuffered std streambuf implementation - c++

For some quick testing of a serialization library I want to create a streambuf that can read/write to/from a socket. I do not want to use a buffer in the streambuf, but let the socket handle this. I am sure the serialization lib will only call std::istream::read and std::ostream::write. A quick look at Microsoft's basic_streambuf implementation shows that these calls are forwarded almost directly to xsputn and xsgetn.
The question is: can I derive from a streambuf, implement only xsputn and xsgetn, and be sure that the streams that use my implementation will always call these methods, and not sync/overflow/underflow/pback/... ? Or should I override sync etc. to return errors, or does the standard guarantee that the default implementations are fine? Preferably this should work on any common platform, and I cannot use boost::iostreams.
Practically I'd use something like this:
class socket_buf : public std::streambuf
{
public:
//Socket is a class with std::stream-like read/write methods
socket_buf( Socket& s ) : sock( s ) {}
protected:
std::streamsize xsputn( const char* s, std::streamsize n )
{
return sock.write( s, n );
}
std::streamsize xsgetn( char* s, std::streamsize n )
{
return sock.read( s, n );
}
private:
Socket& sock;
};

It's (almost?) impossible to implement a std::streambuf without a buffer. You will have to override underflow and overflow, as many of the public interfaces to std::streambuf won't go via xsputn or xsgetn, e.g. sputc, sbumpc, etc. Even sputn is not guaranteed to cause a call to xsputn, depending on the state of the internal buffer and the particular std::streambuf implementation.
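A minimal sketch of what overriding the single-character paths could look like. The Socket class here is a hypothetical in-memory stand-in (the question's real Socket is not shown), and the one-character peek slot is an assumption needed so that underflow can return a character without consuming it; strictly speaking that slot is a tiny buffer. Putback is not supported in this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical stand-in for the question's Socket: write() appends to an
// in-memory string, read() consumes from its front.
class Socket {
public:
    std::streamsize write(const char* s, std::streamsize n) {
        data_.append(s, static_cast<std::size_t>(n));
        return n;
    }
    std::streamsize read(char* s, std::streamsize n) {
        const std::size_t avail = data_.size() - pos_;
        const std::size_t count = std::min(avail, static_cast<std::size_t>(n));
        data_.copy(s, count, pos_);
        pos_ += count;
        return static_cast<std::streamsize>(count);
    }
private:
    std::string data_;
    std::size_t pos_ = 0;
};

class socket_buf : public std::streambuf {
public:
    explicit socket_buf(Socket& s) : sock_(s) {}

protected:
    // Bulk write path, as in the question.
    std::streamsize xsputn(const char* s, std::streamsize n) override {
        return sock_.write(s, n);
    }
    // Bulk read path; a previously peeked character is delivered first.
    std::streamsize xsgetn(char* s, std::streamsize n) override {
        std::streamsize total = 0;
        if (has_peek_ && n > 0) {
            *s++ = peek_;
            has_peek_ = false;
            ++total;
            --n;
        }
        if (n > 0) total += sock_.read(s, n);
        return total;
    }
    // Single-character write path (reached via sputc when there is no put area).
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);
        const char c = traits_type::to_char_type(ch);
        return sock_.write(&c, 1) == 1 ? ch : traits_type::eof();
    }
    // Peek: fetch one character and keep it until it is consumed.
    int_type underflow() override {
        if (!has_peek_) {
            if (sock_.read(&peek_, 1) != 1) return traits_type::eof();
            has_peek_ = true;
        }
        return traits_type::to_int_type(peek_);
    }
    // Consume: the default uflow would advance a get pointer we do not have.
    int_type uflow() override {
        const int_type ch = underflow();
        has_peek_ = false;
        return ch;
    }

private:
    Socket& sock_;
    char peek_ = 0;
    bool has_peek_ = false;
};
```

With this in place, std::istream::read, get and peek as well as std::ostream::write and put all end up at the socket, whichever entry point a given standard library happens to use.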

Related

Wrapper class design and dependency injection

I have a simple FTP class that take care of downloading and uploading through cURL libraries:
class FTPClient
{
public:
explicit FTPClient(const std::string& strIPAddress);
virtual ~FTPClient();
bool DownloadFile(const std::string& strRemoteFile, const std::string& strLocalFile);
bool UploadFile(const std::string& strLocalFile, const std::string& strRemoteFile);
private:
static size_t WriteToFileCallBack(void *ptr, size_t size, size_t nmemb, FILE *stream);
static size_t ReadFromFileCallback(void* ptr, size_t size, size_t nmemb, FILE *stream);
std::string m_strUser;
std::string m_strPass;
std::string m_strIPAddress;
std::string m_strPort;
mutable CURL* m_objCurlSession;
};
I've asked for advice on how it could be implemented and structured better, since it's the base and core of a project and it's going to be used in many parts.
I've been told to use a cURLWrapper class to wrap all of the cURL calls (curl_easy_setopt(..)), but then I've been told to create an interface for the FTP class, a cURLWrapper that just calls the FTP methods, and then a concrete class. It's still too abstract for me, and I don't understand the best way to implement it or which path to follow.
How would you approach this small structure?
Define a simple interface for your FTP class:
class IFTPClient
{
public:
virtual ~IFTPClient() {}
virtual bool DownloadFile(const std::string& strRemoteFile, const std::string& strLocalFile) = 0;
virtual bool UploadFile(const std::string& strLocalFile, const std::string& strRemoteFile) = 0;
};
I assume your static callback methods are calling back into some class instance rather than into a singleton? That's fine then. Derive your class from the interface:
class FTPClient : public IFTPClient
{
...
I notice that you have the IP address passed into the constructor and have other parameters (user name, password, port, etc.) defined elsewhere. That does not appear to be quite consistent yet. You would need to refactor that so that these parameters can be set through interface methods or add those to the upload/download methods.
Construct your FTPClient object(s) before you need it elsewhere and then only pass ("inject") the interface into objects that want to use the FTPClient. For unit testing without the use of the actual FTPClient, construct a mock object derived from the same interface and inject that into other objects instead.
Other objects simply make use of the functionality exposed in the interface and don't need to know or worry about its internal implementation; if you decide to use curl or something else is then entirely up to FTPClient.
That's it in a nutshell; you may want to search for dependency injection and frameworks on the Internet, but you don't need a framework to follow dependency injection principles and, in my opinion, they can be overkill for simple projects.
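A small sketch of the injection idea described above. MockFTPClient and ReportPublisher are hypothetical names invented for illustration, and the "/reports/" remote path is likewise made up; only the IFTPClient interface comes from the answer.

```cpp
#include <string>
#include <vector>

class IFTPClient {
public:
    virtual ~IFTPClient() = default;
    virtual bool DownloadFile(const std::string& strRemoteFile, const std::string& strLocalFile) = 0;
    virtual bool UploadFile(const std::string& strLocalFile, const std::string& strRemoteFile) = 0;
};

// Hypothetical test double: records the calls instead of touching the network.
class MockFTPClient : public IFTPClient {
public:
    bool DownloadFile(const std::string& strRemoteFile, const std::string&) override {
        downloads.push_back(strRemoteFile);
        return succeed;
    }
    bool UploadFile(const std::string& strLocalFile, const std::string&) override {
        uploads.push_back(strLocalFile);
        return succeed;
    }
    std::vector<std::string> downloads;
    std::vector<std::string> uploads;
    bool succeed = true;  // lets a test simulate a failing transfer
};

// Hypothetical consumer: it only knows the interface, which is injected
// through the constructor.
class ReportPublisher {
public:
    explicit ReportPublisher(IFTPClient& ftp) : m_ftp(ftp) {}
    bool Publish(const std::string& localFile) {
        return m_ftp.UploadFile(localFile, "/reports/" + localFile);
    }
private:
    IFTPClient& m_ftp;
};
```

In production code the same ReportPublisher would be handed the real curl-based FTPClient; in a unit test it gets the mock, and the test can inspect the recorded uploads.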

converting text sent to std::ostream

I have a function that sends stuff to std::ostream.
Something like this
void f(std::ostream& oss) {
oss << "Hello world!";
}
Now, I would like to create my own class that derives from std::ostream that would be parsing outgoing text, and changing it, so it would print "Ola world!", for example.
class StreamConverted : public std::ostream {
...
};
I believe (I am not too experienced with stream manipulation) that I would have to 'play' with the underlying rdbuf() of the stream, so I would have to substitute rdbuf of std::ostream with mine.
MyStreamBuf m_my_streambuf;
std::ostream& m_original_stream;
std::streambuf* m_original_streambuf;
public:
StreamConverted(std::ostream& os)
: m_original_stream(os)
, m_original_streambuf(os.rdbuf(&m_my_streambuf))
{}
(Please forgive any obvious mistakes or typos; I am writing this on the fly. I would also add a destructor to restore the original streambuf.)
This leads me to the need to write my MyStreamBuf that would derive from std::streambuf
class MyStreamBuf : public std::streambuf {
};
And this is the point where I need advice.
Should I create my own buffer by calling std::streambuf::setp(begin, end) and then override the overflow() method to parse the contents of the buffer when it is called, and then send the data out to the original streambuf after transforming the buffer 'in some reasonable way'?
I am not sure if I am going too far by modifying the buffer instead of doing something with the ostream...
Any advice?

Constant and Overloaded Constructor

I have a class that is primarily used to give structure to a buffer. One client generally uses it to write and another uses it to read. For writing, there are default values that the class sets, but for reading, it should leave the buffer alone.
class Formatter
{
public:
//! Used by writer
Formatter( unsigned char* Buffer ) :
m_Buffer( Buffer )
{
Buffer[ 0 ] = 1; //say this is the format
}
//! Used by reader
Formatter( const unsigned char* Buffer ) :
m_Buffer( Buffer )
{
}
//...Other methods returns pointer to structure
private:
unsigned char* m_Buffer;
};
The problem here is that it is easy for a reader to make a mistake by passing in a non-const buffer, which silently selects the writer constructor.
//..assume pBuffer is non-const
//We really want to read
const Formatter myFormatter( pBuffer );
//We really want const Formatter myFormatter( const_cast<const unsigned char*>(pBuffer) );
I can't really think of a nice way to prevent the user from making this mistake without requiring the user to be explicit.
Anyone know of a nice trick?
Thanks in advance.
struct writer_access {};
Formatter( writer_access, unsigned char* Buffer ) :
Here we mark the constructor that has different semantics with a tag type. Accidental use should be nearly impossible. You call it by passing (writer_access{}, pbuff).
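A fuller sketch of the tag idea, under the assumption that the class keeps a const pointer for reading and a separate, possibly null, pointer for writing. The format() and writable() members are hypothetical additions for illustration.

```cpp
struct writer_access {};

class Formatter {
public:
    // Writer: only reachable by passing the tag explicitly.
    Formatter(writer_access, unsigned char* Buffer)
        : m_Read(Buffer), m_Write(Buffer)
    {
        Buffer[0] = 1;  // say this is the format
    }
    // Reader: accepts const and non-const buffers alike, never modifies them.
    explicit Formatter(const unsigned char* Buffer)
        : m_Read(Buffer), m_Write(nullptr)
    {
    }
    unsigned char format() const { return m_Read[0]; }
    bool writable() const { return m_Write != nullptr; }

private:
    const unsigned char* m_Read;
    unsigned char* m_Write;  // null for readers
};
```

Because the single-argument constructor is the only one that matches a lone pointer, passing a non-const buffer without the tag now selects the reader and leaves the buffer untouched.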
Two tricks come to my mind:
Make Formatter abstract and provide subclasses of it, one for reading, one for writing:
class ReadFormatter : public Formatter {
public:
ReadFormatter(const unsigned char*);
};
class WriteFormatter : public Formatter {
public:
WriteFormatter(unsigned char*);
};
Create two static member functions for the two tasks:
static Formatter ReadFormatter(…) { … }
static Formatter WriteFormatter(…) { … }
IMO subclassing would be the better way.

Splitting a file and passing the data on to other classes

In my current project, I have a lot of binary files of different formats. Several of them act as simple archives, and therefore I am trying to come up with a good approach for passing extracted file data on to other classes.
Here's a simplified example of my current approach:
class Archive {
private:
std::istream &fs;
void Read();
public:
Archive(std::istream &fs); // Calls Read() automatically
~Archive();
const char* Get(int archiveIndex);
size_t GetSize(int archiveIndex);
};
class FileFormat {
private:
std::istream &fs;
void Read();
public:
FileFormat(std::istream &fs); // Calls Read() automatically
~FileFormat();
};
The Archive class basically parses the archive and reads the stored files into char pointers.
In order to load the first FileFormat file from an Archive, I would currently use the following code:
std::ifstream fs("somearchive.arc", std::ios::binary);
Archive arc(fs);
std::istringstream ss(std::string(arc.Get(0), arc.GetSize(0)), std::ios::binary);
FileFormat ff(ss);
(Note that some files in an archive could be additional archives but of a different format.)
When reading the binary data, I use a BinaryReader class with functions like these:
BinaryReader::BinaryReader(std::istream &fs) : fs(fs) {
}
char* BinaryReader::ReadBytes(unsigned int n) {
char* buffer = new char[n];
fs.read(buffer, n);
return buffer;
}
unsigned int BinaryReader::ReadUInt32() {
unsigned int buffer;
fs.read((char*)&buffer, sizeof(unsigned int));
return buffer;
}
I like the simplicity of this approach but I'm currently struggling with a lot of memory errors and SIGSEGVs and I'm afraid that it's because of this method. An example is when I create and read an archive repeatedly in a loop. It works for a large number of iterations, but after a while, it starts reading junk data instead.
My question to you is if this approach is feasible (in which case I ask what I am doing wrong), and if not, what better approaches are there?
The flaws of code in the OP are:
You are allocating heap memory and returning a pointer to it from one of your functions. This may lead to memory leaks. You have no problem with leaks (for now) but you must have such stuff in mind while designing your classes.
When dealing with the Archive and FileFormat classes, the user always has to take into account the internal structure of your archive. This undermines the idea of data encapsulation.
When user of your class framework creates an Archive object, he just gets a way to extract a pointer to some raw data. Then the user must pass this raw data to completely independent class. Also you will have more than one kind of FileFormat. Even without the need to watch for leaky heap allocations dealing with such system will be highly error-prone.
Let's try to apply some OOP principles to the task. Your Archive object is a container of Files of different formats. So an Archive's equivalent of Get() should generally return File objects, not a pointer to raw data:
//We are going to need a way to store the file type in your archive index
enum TFileType { BYTE_FILE, UINT32_FILE, /*...*/ };
class BaseFile {
public:
virtual ~BaseFile() {} // needed: Archive deletes files through BaseFile*
virtual TFileType GetFileType() const = 0;
/* Your abstract interface here */
};
class ByteFile : public BaseFile {
public:
ByteFile(istream &fs);
virtual ~ByteFile();
virtual TFileType GetFileType() const
{ return BYTE_FILE; }
unsigned char GetByte(size_t index);
protected:
/* implementation of data storage and reading procedures */
};
class UInt32File : public BaseFile {
public:
UInt32File(istream &fs);
virtual ~UInt32File();
virtual TFileType GetFileType() const
{ return UINT32_FILE; }
uint32_t GetUInt32(size_t index);
protected:
/* implementation of data storage and reading procedures */
};
class Archive {
public:
Archive(const char* filename);
~Archive();
BaseFile* Get(int archiveIndex)
{ return (m_Files.at(archiveIndex)); }
/* ... */
protected:
vector<BaseFile*> m_Files;
};
Archive::Archive(const char* filename)
{
ifstream fs(filename, ios::binary); // binary mode, as in the question
//Here we need to:
//1. Read archive index
//2. For each file in index do something like:
switch(CurrentFileType) {
case BYTE_FILE:
m_Files.push_back(new ByteFile(fs));
break;
case UINT32_FILE:
m_Files.push_back(new UInt32File(fs));
break;
//.....
}
}
Archive::~Archive()
{
for(size_t i = 0; i < m_Files.size(); ++i)
delete m_Files[i];
}
int main(int argc, char** argv)
{
Archive arch("somearchive.arc");
BaseFile* pbf;
ByteFile* pByteFile;
pbf = arch.Get(0);
//Here we can use GetFileType() or typeid to make a proper cast
//An example of former:
switch ( pbf->GetFileType() ) {
case BYTE_FILE:
pByteFile = dynamic_cast<ByteFile*>(pbf);
ASSERT(pByteFile != 0 );
//Working with byte data
break;
/*...*/
}
//alternatively you may omit GetFileType() and rely solely on C++
//typeid-related stuff
}
That's just a general idea of the classes that may simplify the usage of archives in your application.
Keep in mind, though, that good class design may help you prevent memory leaks, clarify code and so on, but whatever classes you have, you will still have to deal with binary data storage problems. For example, if your archive stores 64 bytes of byte data and 8 uint32's and you somehow read 65 bytes instead of 64, reading the following ints will give you junk. You may also encounter alignment and endianness problems (the latter is important if your applications are supposed to run on several platforms). Still, good class design may help you produce better code that addresses such problems.
It is asking for trouble to pass a pointer from your function and expect the user to know to delete it, unless the function name is such that it is obvious to do so, e.g. a function that begins with the word create.
So
Foo * createFoo();
is likely to be a function that creates an object that the user must delete.
A preferable solution would, for starters, be to return std::vector<char> or allow the user to pass std::vector<char> & to your function and you write the bytes into it, setting its size if necessary. (This is more efficient if doing multiple reads where you can reuse the same buffer).
You should also learn const-correctness.
As for your "after a while it fills with junk", where do you check for end of file?

C++ design - Network packets and serialization

I have, for my game, a Packet class, which represents network packet and consists basically of an array of data, and some pure virtual functions
I would then like to have classes deriving from Packet, for example StatePacket, PauseRequestPacket, etc. Each of these sub-classes would implement the virtual function Handle(), which would be called by the networking engine when one of these packets is received so that it can do its job, plus several get/set functions which would read and set fields in the array of data.
So I have two problems:
The (abstract) Packet class would need to be copyable and assignable, but without slicing, keeping all the fields of the derived class. It may even be possible that the derived class will have no extra fields, only function, which would work with the array on the base class. How can I achieve that?
When serializing, I would give each sub-class a unique numeric ID, and then write it to the stream before the sub-class' own serialization. But for unserialization, how would I map the read ID to the appropriate sub-class to instantiate it?
If anyone wants any clarifications, just ask.
-- Thank you
Edit: I'm not quite happy with it, but that's what I managed:
Packet.h: http://pastebin.com/f512e52f1
Packet.cpp: http://pastebin.com/f5d535d19
PacketFactory.h: http://pastebin.com/f29b7d637
PacketFactory.cpp: http://pastebin.com/f689edd9b
PacketAcknowledge.h: http://pastebin.com/f50f13d6f
PacketAcknowledge.cpp: http://pastebin.com/f62d34eef
If someone has the time to look at it and suggest any improvements, I'd be thankful.
Yes, I'm aware of the factory pattern, but how would I code it to construct each class? A giant switch statement? That would also duplicate the ID for each class (once in the factory and once in the serializer), which I'd like to avoid.
For copying you need to write a clone function, since a constructor cannot be virtual:
virtual Packet * clone() const = 0;
Each Packet implementation implements it like this:
virtual Packet * clone() const {
return new StatePacket(*this);
}
This example is for StatePacket. Packet classes should be immutable: once a packet is received, its data can either be copied out or thrown away, so an assignment operator is not required. Make the assignment operator private and don't define it, which effectively forbids assigning packets.
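Put together, the clone idea could look roughly like the sketch below. It uses C++11 "= delete" in place of the private, undefined assignment operator; the type() and state() members and the numeric value 1 are hypothetical, added only so the copy can be inspected.

```cpp
#include <memory>

class Packet {
public:
    virtual ~Packet() = default;
    virtual Packet* clone() const = 0;  // polymorphic copy, avoids slicing
    virtual int type() const = 0;       // hypothetical, for illustration

    Packet& operator=(const Packet&) = delete;  // packets are not assignable

protected:
    Packet() = default;
    Packet(const Packet&) = default;  // copying only through clone()
};

class StatePacket : public Packet {
public:
    explicit StatePacket(int state) : state_(state) {}
    Packet* clone() const override { return new StatePacket(*this); }
    int type() const override { return 1; }
    int state() const { return state_; }

private:
    int state_;
};

// Convenience wrapper so the caller cannot forget to delete the copy.
inline std::unique_ptr<Packet> copy_packet(const Packet& p) {
    return std::unique_ptr<Packet>(p.clone());
}
```

A caller holding only a Packet reference gets a full StatePacket copy from copy_packet, with all derived fields preserved.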
For de-serialization, you use the factory pattern: create a class which creates the right message type given the message id. For this, you can either use a switch statement over the known message IDs, or a map like this:
struct MessageFactory {
std::map<Packet::IdType, Packet * (*)()> map;
MessageFactory() {
map[StatePacket::Id] = &StatePacket::createInstance;
// ... all other
}
Packet * createInstance(Packet::IdType id) {
return map[id]();
}
} globalMessageFactory;
Of course, you should add checks, such as whether the id is actually known. This is only the rough idea.
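A sketch of the map-based factory with the lookup check added. The minimal Packet and StatePacket definitions are placeholders so the example stands on its own, and returning a null pointer for an unknown id is one possible policy, not the only one.

```cpp
#include <map>

// Placeholder packet types, just enough for the factory to compile.
class Packet {
public:
    typedef int IdType;
    virtual ~Packet() {}
    virtual IdType id() const = 0;
};

class StatePacket : public Packet {
public:
    enum { Id = 1 };  // hypothetical id value
    static Packet* createInstance() { return new StatePacket; }
    IdType id() const override { return Id; }
};

struct MessageFactory {
    typedef Packet* (*Creator)();
    std::map<Packet::IdType, Creator> map;

    MessageFactory() {
        map[StatePacket::Id] = &StatePacket::createInstance;
        // ... all other
    }

    // Returns a new packet owned by the caller, or a null pointer if the id
    // is unknown. Using find() avoids inserting an empty entry, which
    // operator[] would do.
    Packet* createInstance(Packet::IdType id) const {
        std::map<Packet::IdType, Creator>::const_iterator it = map.find(id);
        return it == map.end() ? nullptr : it->second();
    }
};
```

Using find() rather than map[id]() matters here: operator[] on an unknown id would insert a null function pointer and then call through it.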
You need to look up the Factory Pattern.
The factory looks at the incoming data and creates an object of the correct class for you.
To have a Factory class that does not know about all the types ahead of time you need to provide a singleton where each class registers itself. I always get the syntax for defining static members of a template class wrong, so do not just cut&paste this:
class Packet { ... };
typedef Packet* (*packet_creator)();
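class Factory {
public:
// static, because it is called as Factory::add_type(...) below
static bool add_type(int id, packet_creator creator) {
registry()[id] = creator; return true;
}
private:
// function-local static: exists before the first registration runs
static std::map<int, packet_creator>& registry() {
static std::map<int, packet_creator> map_;
return map_;
}
};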
template<typename T>
class register_with_factory {
public:
static Packet * create() { return new T; }
static bool registered;
};
template<typename T>
bool register_with_factory<T>::registered = Factory::add_type(T::id(), create);
class MyPacket : private register_with_factory<MyPacket>, public Packet {
//... your stuff here...
static int id() { return /* some number that you decide */; }
};
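Since the code above is explicitly not meant to be pasted as is, here is a sketch of a simpler variant of the same self-registration idea. Instead of a static member of a class template (which is only instantiated if something actually uses it), it uses a small registrar object defined at namespace scope next to each packet class. The names packet_registrar, register_my_packet and type_id, and the id value 7, are all hypothetical.

```cpp
#include <map>

class Packet {
public:
    virtual ~Packet() {}
    virtual int type_id() const = 0;
};

typedef Packet* (*packet_creator)();

class Factory {
public:
    static bool add_type(int id, packet_creator creator) {
        registry()[id] = creator;
        return true;
    }
    // Returns a new packet owned by the caller, or a null pointer for an unknown id.
    static Packet* create(int id) {
        std::map<int, packet_creator>::const_iterator it = registry().find(id);
        return it == registry().end() ? nullptr : it->second();
    }

private:
    // Function-local static: constructed on first use, so it already exists
    // when the registrar objects below run their constructors.
    static std::map<int, packet_creator>& registry() {
        static std::map<int, packet_creator> map_;
        return map_;
    }
};

// Registers T with the factory when an object of this type is constructed.
template <typename T>
struct packet_registrar {
    packet_registrar() { Factory::add_type(T::id(), &packet_registrar<T>::create); }
    static Packet* create() { return new T; }
};

class MyPacket : public Packet {
public:
    static int id() { return 7; }
    int type_id() const override { return id(); }
};

// One line per packet type, typically in that packet's .cpp file.
static packet_registrar<MyPacket> register_my_packet;
```

The factory itself never names MyPacket, so new packet types can be added without touching it. One caveat: if the packet lives in a static library, the linker may drop an object file that nothing else references, and the registration with it.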
Why do we, myself included, always make such simple problems so complicated?
Perhaps I'm off base here. But I have to wonder: Is this really the best design for your needs?
By and large, function-only inheritance can be better achieved through function/method pointers, or aggregation/delegation and the passing around of data objects, than through polymorphism.
Polymorphism is a very powerful and useful tool. But it's only one of many tools available to us.
It looks like each subclass of Packet will need its own Marshalling and Unmarshalling code. Perhaps inheriting Packet's Marshalling/Unmarshalling code? Perhaps extending it? All on top of handle() and whatever else is required.
That's a lot of code.
While substantially more kludgey, it might be shorter & faster to implement Packet's data as a struct/union attribute of the Packet class.
Marshalling and Unmarshalling would then be centralized.
Depending on your architecture, it could be as simple as write(&data). Assuming there are no big/little-endian issues between your client/server systems, and no padding issues. (E.g. sizeof(data) is the same on both systems.)
Write(&data)/read(&data) is a bug-prone technique. But it's often a very fast way to write the first draft. Later on, when time permits, you can replace it with individual per-attribute type-based Marshalling/Unmarshalling code.
Also: I've taken to storing data that's being sent/received as a struct. You can bitwise copy a struct with operator=(), which at times has been VERY helpful! Though perhaps not so much in this case.
Ultimately, you are going to have a switch statement somewhere on that subclass-id type. The factory technique (which is quite powerful and useful in its own right) does this switch for you, looking up the necessary clone() or copy() method/object.
OR you could do it yourself in Packet. You could just use something as simple as:
( getHandlerPointer( id ) ) ( this )
Another advantage to an approach this kludgey (function pointers), aside from the rapid development time, is that you don't need to constantly allocate and delete a new object for each packet. You can re-use a single packet object over and over again. Or a vector of packets if you wanted to queue them. (Mind you, I'd clear the Packet object before invoking read() again! Just to be safe...)
Depending on your game's network traffic density, allocation/deallocation could get expensive. Then again, premature optimization is the root of all evil. And you could always just roll your own new/delete operators. (Yet more coding overhead...)
What you lose (with function pointers) is the clean segregation of each packet type. Specifically the ability to add new packet types without altering pre-existing code/files.
Example code:
class Packet
{
public:
enum PACKET_TYPES
{
STATE_PACKET = 0,
PAUSE_REQUEST_PACKET,
MAXIMUM_PACKET_TYPES,
FIRST_PACKET_TYPE = STATE_PACKET
};
typedef bool ( * HandlerType ) ( const Packet & );
protected:
/* Note: Initialize handlers to NULL when declared! */
static HandlerType handlers [ MAXIMUM_PACKET_TYPES ];
static HandlerType getHandler( int thePacketType )
{ // My own assert macro...
UASSERT( thePacketType, >=, FIRST_PACKET_TYPE );
UASSERT( thePacketType, <, MAXIMUM_PACKET_TYPES );
UASSERT( handlers [ thePacketType ], !=, HandlerType(NULL) );
return handlers [ thePacketType ];
}
protected:
struct Data
{
// Common data to all packets.
int number;
int type;
union
{
struct
{
int foo;
} statePacket;
struct
{
int bar;
} pauseRequestPacket;
} u;
} data;
public:
//...
bool readFromSocket() { /*read(&data); */ return true; } // Unmarshal (placeholder result)
bool writeToSocket() { /*write(&data);*/ return true; } // Marshal (placeholder result)
bool handle() { return ( getHandler( data.type ) ) ( * this ); }
}; /* class Packet */
PS: You might dig around with google and grab down cdecl/c++decl. They are very useful programs. Especially when playing around with function pointers.
E.g.:
c++decl> declare foo as function(int) returning pointer to function returning void
void (*foo(int ))()
c++decl> explain void (* getHandler( int ))( const int & );
declare getHandler as function (int) returning pointer to function (reference to const int) returning void