C++ object that modifies itself in memory - c++

A friend of mine and some other guy have written the following code that according to my C++ knowledge should be very dangerous:
Recovery& Recovery::LoadRecoverFile() {
fstream File("files/recover.dat", ios::in | ios::binary);
if (File.is_open()) {
while (!File.eof()) {
File.read(reinterpret_cast<char*>(this), sizeof(Recovery)); // <----- ?
}
}
File.close();
return *this; // <----- ?
}
Could you give me your opinion why this is bad and how should it be done correctly?
They basically write an object of class Recovery to a file and when required they read it in with the above method.
Edit:
Just to give some additional information about the code. This is what class Recovery contains.
class Recovery {
public:
Recovery();
virtual ~Recovery();
void createRecoverFile();
void saveRecoverFile( int level, int win, int credit, gameStates state, int clicks );
Recovery& LoadRecoverFile();
const vector<Card>& getRecoverCards() const;
void setRecoverCards(const vector<Card>& recoverCards);
int getRecoverClicks() const;
void setRecoverClicks(int recoverClicks);
int getRecoverCredit() const;
void setRecoverCredit(int recoverCredit);
int getRecoverLevel() const;
void setRecoverLevel(int recoverLevel);
gameStates getRecoverState() const;
void setRecoverState(gameStates recoverState);
int getRecoverTime() const;
void setRecoverTime(int recoverTime);
int getRecoverWin() const;
void setRecoverWin(int recoverWin);
private:
int m_RecoverLevel;
int m_RecoverCredit;
gameStates m_RecoverState;
};
This saves the object to a file:
void Recovery::saveRecoverFile(int level, int win, int credit, gameStates state,
int clicks) {
m_RecoverLevel = level;
m_RecoverCredit = credit;
m_RecoverState = state;
ofstream newFile("files/recover.dat", ios::binary | ios::out);
if (newFile.is_open()) {
newFile.write(reinterpret_cast<char*>(this), sizeof(Recovery));
}
newFile.close();
}
That's how it is used:
m_Recovery.LoadRecoverFile();
credit.IntToTextMessage(m_Recovery.getRecoverCredit());
level.IntToTextMessage(m_Recovery.getRecoverLevel());
m_cardLogic.setTempLevel(m_Recovery.getRecoverLevel());
Timer::g_Timer->StartTimer(m_Recovery.getRecoverLevel() + 3);

It probably is undefined behavior (unless Recovery is a POD made only of scalar fields).
It probably won't work if the Recovery class has a vtable, unless perhaps the process which is reading is the same process which wrote it. Vtables contain function pointers (usually, addresses of some machine code). And these function pointers would vary from one process to another one (even if they are running the same binary), e.g. because of ASLR.
It also won't work if Recovery contains other objects (e.g. a std::vector<std::shared_ptr<Recovery>> ...., or your gameStates), because these sub-objects won't be constructed correctly.
It could work sometimes. But what you apparently are looking for is serialization (then I would suggest using a textual format like JSON, but see also libs11n) or application checkpointing. You should design your application with those goals from the very start.
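As a rough illustration of the serialization suggestion, the saved fields can be written and read one by one as text instead of dumping the whole object. This is only a sketch: RecoveryData, saveText, loadText and roundTripDemo are hypothetical names that do not appear in the original code, and the enum is stored as a plain int.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

enum gameStates { MENU, STARTGAME, GAMEOVER, RECOVERY, RULES_OF_GAMES, VIEW_CARDS, STATISTIC };

// Hypothetical plain-data mirror of the three data members of Recovery.
struct RecoveryData {
    int level;
    int credit;
    gameStates state;
};

// Write each field as text; no vtable pointer or padding ever reaches the stream.
bool saveText(std::ostream& out, const RecoveryData& r) {
    out << r.level << ' ' << r.credit << ' ' << static_cast<int>(r.state) << '\n';
    return static_cast<bool>(out);
}

// Read into temporaries first so that 'r' is left untouched on failure.
bool loadText(std::istream& in, RecoveryData& r) {
    int level = 0, credit = 0, state = 0;
    if (!(in >> level >> credit >> state)) return false;
    if (state < MENU || state > STATISTIC) return false;  // reject unknown enum values
    r.level = level;
    r.credit = credit;
    r.state = static_cast<gameStates>(state);
    return true;
}

// Round trip through an in-memory stream.
void roundTripDemo() {
    RecoveryData saved = {3, 250, GAMEOVER};
    std::stringstream ss;
    assert(saveText(ss, saved));
    RecoveryData loaded = {0, 0, MENU};
    assert(loadText(ss, loaded));
    assert(loaded.level == 3 && loaded.credit == 250 && loaded.state == GAMEOVER);
}
```

The same two functions work with std::ofstream and std::ifstream, since both derive from the stream types used in the signatures.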

It really depends on what a Recovery object contains. If it contains pointers to data, open resource descriptors and things like that you will not be able to store those on a file in a meaningful way. Restoring a pointer in this way may set its value, but the value it pointed will most certainly not be where you expect it to be anymore.
If Recovery is a POD this should work.
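Whether a class qualifies can be checked at compile time rather than guessed. A minimal sketch, assuming a C++11 compiler; PlainRecovery and VirtualRecovery are hypothetical stand-ins, the second one mimicking the virtual destructor that the posted Recovery class declares.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in: only scalar members, no virtual functions.
struct PlainRecovery {
    int level;
    int credit;
    int state;
};

// Hypothetical stand-in: same idea, but with a virtual destructor.
struct VirtualRecovery {
    virtual ~VirtualRecovery() {}
    int level;
};

// Compilation fails here if raw byte copies of the type are not well defined.
static_assert(std::is_trivially_copyable<PlainRecovery>::value,
              "PlainRecovery must be trivially copyable for raw read()/write()");

// A virtual destructor is enough to make a type non-trivially copyable.
static_assert(!std::is_trivially_copyable<VirtualRecovery>::value,
              "a class with a virtual destructor is not trivially copyable");
```

Placing such a static_assert next to the read() and write() calls turns a silent runtime problem into a build error.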
You may want to look at this question and this other question, which are similar to yours.
As Galik correctly points out, using
while (!File.eof()) {
doesn't make much sense. Instead, you should use
if ( File.read(/* etc etc */) ) {
// Object restored successfully.
}
else {
// Revert changes and signal that object was not loaded.
}
The caller of the function needs to have a way to know if the loading was successful. The method is already a member function, so a better definition could be:
/* Returns true if the file was read successfully, false otherwise.
* If reading fails the previous state of the object is not modified.
*/
bool Recovery::LoadRecoverFile(const std::string & filename);
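One way to keep the promise that a failed read leaves the previous state alone is to read into a temporary and commit only on success. The following is a sketch under assumptions: RecoveryState is a hypothetical plain struct holding the saved fields, and loadRecoverFile is written as a free function rather than the member declared above.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical plain struct; raw byte I/O is only valid for such types.
struct RecoveryState {
    int level;
    int credit;
    int state;
};

// Returns true and fills 'out' on success; leaves 'out' untouched on any failure.
bool loadRecoverFile(const std::string& filename, RecoveryState& out) {
    std::ifstream file(filename.c_str(), std::ios::binary);
    if (!file) return false;                       // file could not be opened
    RecoveryState tmp;
    if (!file.read(reinterpret_cast<char*>(&tmp), sizeof tmp))
        return false;                              // short or failed read
    out = tmp;                                     // commit only after a full read
    return true;
}
```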

Personally I would recommend storing the game state in text format rather than binary. Binary data like this is non-portable, sometimes even between different versions of the same compiler on the same computer or even using different compiler configuration options.
That being said, if you are going the binary route (or not), the main problem I see with the code is the lack of error checking. And the whole idea of getting a Recovery object to hoist itself by its own petard makes error checking very difficult.
I have knocked up something I think is more robust. I don't know the proper program structure you are using so this probably won't match what you need. But it may serve as an example of how this can be approached.
Most importantly always check for errors, report them where appropriate and return them to the caller.
enum gameStates
{
MENU, STARTGAME, GAMEOVER, RECOVERY, RULES_OF_GAMES, VIEW_CARDS, STATISTIC
};
const std::string RECOVER_FILE = "files/recover.dat";
struct Recovery
{
int m_RecoverLevel;
int m_RecoverCredit;
gameStates m_RecoverState;
};
struct WhateverClass
{
Recovery m_Recovery;
bool LoadRecoverFile(Recovery& rec);
public:
bool recover();
};
// Supply the Recover object to be restored and
// return true or false to know it succeeded or not
bool WhateverClass::LoadRecoverFile(Recovery& rec)
{
std::ifstream file(RECOVER_FILE, std::ios::binary);
if(!file.is_open())
{
log("ERROR: opening the recovery file: " << RECOVER_FILE);
return false;
}
if(!file.read(reinterpret_cast<char*>(&rec), sizeof(Recovery)))
{
log("ERROR: reading from recovery file: " << RECOVER_FILE);
return false;
}
return true;
}
bool WhateverClass::recover()
{
if(!LoadRecoverFile(m_Recovery))
return false;
// Recovery is a plain struct here, so its fields are read directly.
credit.IntToTextMessage(m_Recovery.m_RecoverCredit);
level.IntToTextMessage(m_Recovery.m_RecoverLevel);
m_cardLogic.setTempLevel(m_Recovery.m_RecoverLevel);
Timer::g_Timer->StartTimer(m_Recovery.m_RecoverLevel + 3);
return true;
}
Hope this helps.

Hi everyone. Actually, the class StateManager contains only integers:
#ifndef STATEMANAGER_H_
#define STATEMANAGER_H_
enum gameStates {
MENU, STARTGAME, GAMEOVER, RECOVERY, RULES_OF_GAMES, VIEW_CARDS, STATISTIC
};
class StateManager {
public:
static StateManager* stateMachine;
StateManager();
virtual ~StateManager();
gameStates getCurrentGameStates() const;
void setCurrentGameStates(gameStates currentGameStates);
private:
gameStates m_currentGameStates;
};
#endif /* STATEMANAGER_H_ */

Related

Class composition with public member objects

I am trying to figure out the best way to design my program features.
A major component of the program is a Camera class. This Camera object represents the program user interface to a real camera, which interfaces to a computer through a frame grabber card. The camera class can link to a frame grabber, start and stop acquisition, and also mutate/access many different camera properties. When I say many, I'm talking about over 250 unique commands. Each unique command is issued to the camera by sending a serial string through the framegrabber to the physical camera. Each command can be thought of as one of three types. An action, a query, and a value.
An action command is something that doesn't require an equals sign, for example "reset", "open", "close"
A query is something that you can get, but not set, that is usually associated with a value. For example "temperature=?", "sernum=?", "maxframerate=?" commands would cause the camera to send back information. These values cannot be mutated so "temperature=20" would result in an error.
A value is something you can get and set that is usually associated with a value. For example "framerate=30" and "framerate=?" are two unique commands, but I consider the base string "framerate" to be a value command type because it can be both mutated and accessed.
The 250 unique commands can be reduced to ~100 CameraActions, CameraQuerys, and CameraValues. Instead of having 250 methods in my Camera class, I had an idea to compose command objects instead of individual setters, getters, and actions. The command string can be provided in the constructor, or reset with a setter. Then I could compose a CameraCommands object that holds all of the available commands, and provide that as a public member to my Camera.
//CameraAction.h =============================================
class CameraAction {
public:
CameraAction(std::string commandString, SerialInterface* serialInterface);
void operator()() { _serialInterface->sendString(_commandString); }
private:
SerialInterface* _serialInterface;
std::string _commandString;
};
//CameraValue.h =====================================================
class CameraValue {
public:
CameraValue(std::string commandString, double min, double max, SerialInterface* serialInterface);
void set(double value)
{
if(value > _maxValue) { throw std::runtime_error("value too high"); }
if(value < _minValue) { throw std::runtime_error("value too low"); }
std::string valueString = std::to_string(value);
_serialInterface->sendString(_commandString + "=" + valueString);
}
double get()
{
std::string valueString = _serialInterface->sendString(_commandString + "=?");
return atof(valueString.c_str());
}
private:
SerialInterface* _serialInterface;
std::string _commandString;
double _minValue;
double _maxValue;
};
//CameraCommands.h ===================================================
class CameraCommands {
public:
CameraCommands();
CameraAction reset;
CameraQuery temperature;
CameraValue framerate;
CameraValue sensitivity;
//... >100 more of these guys
};
//Camera.h ===========================================================
class Camera {
public:
Camera();
CameraCommands cmd;
void startAcquisition();
void stopAcquisition();
void setDataBuffer(void* buffer);
void setOtherThing(int thing);
};
so that the user could do something like:
Camera myCamera;
myCamera.cmd.reset();
myCamera.cmd.framerate.set(30);
myCamera.cmd.sensitivity.set(95);
double temperature = myCamera.cmd.temperature.get();
myCamera.startAcquisition();
etc...
The main problem here is that I'm exposing public member variables, which is supposed to be a massive no-no. Is my current object design logical, or should I simply implement 250 setters and getters, plus 100 more setters and getters to mutate the minimum and maximum settable values?
This seems kludgey to me because there are also many setters/getters associated with the Camera object that are unrelated to the user commands. It's nice for the user interface to provide the scope of the method (cmd) for the user to know whether something is being mutated physically in the camera, or just being mutated in the programmatic object (other methods). Is there any better way to design my program?
You've basically described an interesting hierarchy:
Command -> Query -> Value.
A Command holds the string that is the text of the command;
It can also offer a protected Send() method for its children to call.
A Query also holds a (protected) int variable (or whatever) that you can get() and/or operator int() immediately, or query() from the camera;
A Value adds the set() and/or operator =(int) command to Query.
The constructor (in particular) of Value can have min and max as you describe.
The Camera object can then have a number of public members:
class Camera {
private: // Classes that no-one else can have!
class Command; friend Command;
#include "Camera.Command.inc"
class Query; friend Query;
#include "Camera.Query.inc"
class Value; friend Value;
#include "Camera.Value.inc"
public: // Variables using above classes
Command reset;
Command open; // Maybe make this one private, for friends?
Command close; // Ditto?
Query temperature;
Query sernum;
Query maxFrameRate;
Value frameRate;
private: // Variables
SerialPort port; // Allow Command and co. access to this
}; // Camera
By organising it like this, then:
The user of the variables can't make impossible requests - there is no method to do so;
The query() and set() methods hide the mechanism to interface with the physical camera.
You'll note I've added #include "Camera.XXX.inc" in the middle of the Camera class. Note:
It doesn't clutter the Camera class with the definitions of those sub-Classes - but the C++ compiler needs them before you can use them, so you need to have them there. And if you want to know what they do, just open the file!
I gave them the .inc extension since they're "included" in the .h file: they don't stand alone as their own header file.
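The Command -> Query -> Value hierarchy described in this answer could be sketched roughly as follows. Everything here is an assumption made for illustration: SerialPort is a hypothetical stand-in that only records what was sent and returns a canned reply, the classes are shown outside Camera for brevity, and values are treated as int.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical stand-in for the real serial link.
struct SerialPort {
    std::string lastSent;
    std::string reply;
    std::string sendString(const std::string& s) { lastSent = s; return reply; }
};

// Holds the command text and offers a protected send() to derived classes.
class Command {
public:
    Command(SerialPort& port, const std::string& cmd) : port_(port), text_(cmd) {}
    void operator()() { send(text_); }  // plain action such as "reset"
protected:
    std::string send(const std::string& s) { return port_.sendString(s); }
    const std::string& text() const { return text_; }
private:
    SerialPort& port_;
    std::string text_;
};

// Adds a cached value that can be refreshed from the camera.
class Query : public Command {
public:
    Query(SerialPort& port, const std::string& cmd) : Command(port, cmd), cached_(0) {}
    int query() { cached_ = std::atoi(send(text() + "=?").c_str()); return cached_; }
    int get() const { return cached_; }
protected:
    int cached_;
};

// Adds a range-checked setter on top of Query.
class Value : public Query {
public:
    Value(SerialPort& port, const std::string& cmd, int min, int max)
        : Query(port, cmd), min_(min), max_(max) {}
    bool set(int v) {
        if (v < min_ || v > max_) return false;  // refuse out-of-range requests
        send(text() + "=" + std::to_string(v));
        cached_ = v;
        return true;
    }
private:
    int min_;
    int max_;
};
```

One wart of this sketch is that Query and Value inherit the public operator() of Command, so a query could still be invoked like an action. Moving the shared text and send() into a separate non-callable base would avoid that.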
You can use one or more structs to group "settings", and then expose a method to set them:
struct MySettings {
int setting1;
int setting2;
};
class Myclass {
private:
int setting1;
int setting2;
public:
Myclass(const MySettings* settings) : setting1(0), setting2(0)
{
ChangeSettings(settings);
}
void ChangeSettings(const MySettings* settings)
{
if (settings != nullptr) {
setting1 = settings->setting1;
setting2 = settings->setting2;
}
}
void TakeSettings(MySettings* settings) const
{
if (settings != nullptr) {
// Copy the current values into the caller's struct.
settings->setting1 = setting1;
settings->setting2 = setting2;
}
}
};
I strongly advise being careful when changing settings while the object is "operational". You can end up in an undefined state where settings are being changed while another thread is using them.
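If settings really can change while another thread reads them, one common remedy is to guard the whole struct with a mutex and hand out copies. This is a minimal sketch, assuming C++11; GuardedSettings is a hypothetical name and the two int fields simply mirror the struct above.

```cpp
#include <cassert>
#include <mutex>

struct MySettings {
    int setting1;
    int setting2;
};

// Hypothetical wrapper: all reads and writes of the settings go through one lock.
class GuardedSettings {
public:
    explicit GuardedSettings(const MySettings& s) : settings_(s) {}

    // Replace both values in one step so readers never see a half-updated pair.
    void change(const MySettings& s) {
        std::lock_guard<std::mutex> lock(mutex_);
        settings_ = s;
    }

    // Return a copy taken under the lock; the caller works on its own snapshot.
    MySettings snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return settings_;
    }

private:
    mutable std::mutex mutex_;
    MySettings settings_;
};
```

A worker would call snapshot() once at the start of an operation and use that copy throughout, so a concurrent change() cannot alter values halfway through.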
In your mentioned design I don't think exposing public members through composition is a big no-no.
When exposing public members, the big no-no is unsafe access to your class internals.
An example would be allowing public access to CameraValue::_maxValue. A user could change that value to anything, causing all sorts of undefined behaviour.
Were it up to me to design this, I wouldn't have a CameraCommands member, as from the looks of it it doesn't add anything other than a level of indirection.
I would either add all the CameraAction and CameraValue members as part of the camera class, or inherit them.
Something like this:
Merging CameraCommands into Camera:
class Camera
{
public:
Camera();
CameraAction reset;
CameraQuery temperature;
CameraValue framerate;
CameraValue sensitivity;
//... >100 more of these guys
void startAcquisition();
void stopAcquisition();
void setDataBuffer(void* buffer);
void setOtherThing(int thing);
};
Inheriting CameraCommands into Camera:
class Camera : public CameraCommands
{
public:
Camera();
void startAcquisition();
void stopAcquisition();
void setDataBuffer(void* buffer);
void setOtherThing(int thing);
};
You can even provide some operators for CameraValue etc so that you can set a value through assignment (operator=), and get a value through either implicit conversion (operator T) or dereferencing (operator*):
template<typename T>
class CameraValue
{
public:
CameraValue(SerialInterface*, std::string cmd);
CameraValue& operator=(const T& val)
{
_val = val;
std::string val_str = std::to_string(_val);
_ser_ifc->sendString(_cmd + "=" + val_str);
return *this; // required: the function returns CameraValue&
}
const T& get() const
{
return _val;
}
// implicit access to _val
operator const T&() const
{
return _val;
}
// dereference operator to access _val
const T& operator*() const
{
return _val;
}
private:
T _val;
SerialInterface* _ser_ifc;
std::string _cmd;
};
Then use CameraValue in your class as follows:
using CameraFramerate = CameraValue<int>;
CameraFramerate framerate;
The above techniques offer (IMO) a more composable use of Camera, such as the following:
Camera camera;
// setting values
camera.framerate = 30;
camera.sensitivity = 95;
// getting values
int framerate = camera.framerate; // uses operator const T&()
int sameFramerate = *camera.framerate; // uses operator*()
The key point here is that Camera::framerate etc don't allow any access that could change your camera class' internal state in an undefined and/or unsafe manner.

C++ Polymorphism and Derived Class Types - "ugly programming" with pointer type casts

First up, I'm not sure exactly how to describe what I'm doing in one line... hence the slightly vague title.
The shortest description of the problem I can give is that "I have a function, and it should be able to take as an argument any of the many possible types of class, and these classes are all derived from a base class".
Specifically, I have 2 categories of class and both implement different types of method, which are similar but not exactly the same.
Perhaps it is better if I just give the example? You will see I do some slightly weird things with pointer type-casts. I don't think these are good programming practices. They are slightly strange at least, and I am wondering if there is an alternative, better method of doing things.
Okay so here goes my attempt at a simplified example:
class device
{
// Nothing here - abstract base class
};
class inputDevice : public device // inherit publicly, but it doesn't matter
{
public:
virtual input* getInput() { return m_input; } // input is a class
protected:
input* m_input;
};
class outputDevice : public device
{
public:
virtual output* getOutput() { return m_output; } // output is also a class
protected:
output* m_output;
};
class inputoutputDevice : public inputDevice, public outputDevice
{
// Inherits the get methods from input and output types
};
// elsewhere in program
void do_something(device* dev, int mode_flag)
{
if(mode_flag == 1) // just an example
{
input* in = ((inputDevice*)dev)->getInput(); // doing strange things with pointers
}
else if(mode_flag == 2)
{
output* out = ((outputDevice*)dev)->getOutput(); // strange things with pointers
}
else if(mode_flag == 3)
{
}
}
So you see that the subtlety here is that the function has some behavior dependent on whether we are dealing with an argument which is an input device or an output device.
I guess I could overload the function many times, but there could be many different types of input, output or both input and output devices... so that would be a rather convoluted method.
Putting the "get" methods into the base class doesn't seem like a good idea either, because the derived classes should NOT have the getInput() method if the device is an OUTPUT device. And similarly, an INPUT device should not have a getOutput() method. Conceptually that just doesn't seem right.
I hope that I explained that clearly enough and didn't make any mistakes.
To expand on my comment, if you look at e.g. this input/output library reference you will see a class diagram that in some ways resembles your class hierarchy: there's a base class (two actually), an "input" class, an "output" class, and an "input/output" class that inherits from both the "input" and "output" classes.
However, you never really reference the base classes std::basic_ios or std::ios_base directly; instead one only uses references to std::ostream for any output stream, and std::istream for any input stream (and std::iostream for any input and output stream).
For example, to overload the input operator >> your function takes a reference to a std::istream object:
std::istream& operator>>(std::istream& input_stream, some_type& dest);
Even for more generic functions you take a reference to either an std::istream, an std::ostream or an std::iostream object. You never use the base class std::basic_ios, precisely because of the problems you have.
To relate more to your problem and how to solve it, use two function overloads, one for the input device and one for the output device. It makes more sense because, first of all, you won't have the problem of checking types and casting, and also because the two functions will operate quite differently depending on whether you are doing input or output anyway. Trying to mix both into a single function just makes the code much less maintainable.
So you should instead have e.g.
void do_something(inputDevice& device);
and
void do_something(outputDevice& device);
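A small sketch of how these overloads behave, including the combined device. The class bodies are reduced to the bare minimum, and returning a std::string instead of void is purely an assumption made here so the selected overload is visible; the third overload is also an addition for illustration.

```cpp
#include <cassert>
#include <string>

struct input {};
struct output {};

class device {
public:
    virtual ~device() {}
};

class inputDevice : public device {
public:
    input* getInput() { return &in_; }
private:
    input in_;
};

class outputDevice : public device {
public:
    output* getOutput() { return &out_; }
private:
    output out_;
};

class inputoutputDevice : public inputDevice, public outputDevice {};

// The overload is picked at compile time from the static type of the argument.
std::string do_something(inputDevice& dev) {
    dev.getInput();
    return "input";
}

std::string do_something(outputDevice& dev) {
    dev.getOutput();
    return "output";
}

// Without this overload, passing an inputoutputDevice would be ambiguous,
// because both base-class conversions are equally good.
std::string do_something(inputoutputDevice& dev) {
    do_something(static_cast<inputDevice&>(dev));
    do_something(static_cast<outputDevice&>(dev));
    return "input+output";
}
```

No mode flag and no cast on a device pointer is needed; a call with the wrong kind of device simply does not compile.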
There is an interesting design issue in your function do_something(): you assume that the type of the device corresponds to the mode parameter, but you have no way of verifying it.
Alternative 1: use dynamic cast
First of all, as you expect your device class to be polymorphic, you should foresee a virtual destructor. This would ensure that device is also polymorphic.
Then you can make use of dynamic casting to make your code reliable (here I assumed that mode 3 was for input/output, but it's just for the general idea):
void do_something(device* dev, int mode_flag)
{
if(mode_flag == 1 || mode_flag==3) // just an example
{
inputDevice* id=dynamic_cast<inputDevice*>(dev); // NULL if it's not an input device
if (id) {
input* in = id->getInput(); // doing strange things with pointers
}
else cout << "Invalid input mode for device"<<endl;
}
if(mode_flag == 2 || mode_flag==3)
{
outputDevice* od=dynamic_cast<outputDevice*>(dev);
if (od) {
output* out = od->getOutput();
}
else cout << "Invalid output mode for device"<<endl;
}
// ...
}
Alternative 2: make do_something a method
I don't know how complex it is, but if you intend to do_something with any kind of devices, you could just make it a method.
class device {
public:
virtual void do_something(int mode_flag) = 0;
virtual ~device() {}
};
You'll get the idea. Of course, you could also have a mix having a global do_something() function doing general steps and calling member functions for the part which should depend on the type of device.
Other remarks
Note that your inputoutputDevice inherits twice from device. As soon as you have some members in device, this might lead to ambiguity. I'd therefore suggest you consider virtual inheritance for the device class.
class inputDevice : public virtual device
...;
class outputDevice : public virtual device
...;
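To make the effect of virtual inheritance concrete, here is a minimal sketch. The id member and the device_id function are hypothetical additions, used only to show that the combined class ends up with a single shared device subobject.

```cpp
#include <cassert>

class device {
public:
    device() : id(0) {}
    virtual ~device() {}
    int id;  // hypothetical member, just to have some state in the base
};

class inputDevice : public virtual device {};
class outputDevice : public virtual device {};

// Only one device subobject exists here because both bases inherit virtually.
class inputoutputDevice : public inputDevice, public outputDevice {};

// Accepts any device; the conversion from inputoutputDevice is unambiguous.
int device_id(const device& d) { return d.id; }
```

With plain (non-virtual) inheritance, both `io.id` and the conversion of an inputoutputDevice to `device&` would be rejected as ambiguous, since there would be two separate device subobjects.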
Another approach could be to have a more elaborate input/output interface in device:
class device {
public:
virtual bool can_input() = 0; // what can the device do ?
virtual bool can_output() = 0;
virtual input* getInput() = 0;
virtual void setOutput(output*) = 0;
virtual ~device() {};
};
class inputDevice : public virtual device {
bool can_input() { return true; }
bool can_output() { return false; }
input* getInput() { return m_input; } // input is a class
void setOutput(output*) { throw 1; } // should never be called !
};
...
void do_something(device* dev, int mode_flag)
{
if(mode_flag == 1 && dev->can_input() ) // just an example
...
...
}
Since the problem domain is quite broad it is impossible to give an exact answer, but since it mentions device the linux kernel device model might be suitable.
See the linux-kernel Tag wiki for a deep dive. Look into LDD3 in there, since it's a free ebook, you can have a look how the kernel works internally.
The general concept of the linux kernel is that every device is represented by files. Hence your driver exports file descriptors, each of which has a vtable (see fs.h).
One of the simplest character devices is the named pipe (See its vtable and there is also all the function definitions in the file).
A simple C++ conversion could look like:
struct abstract_dev {
virtual int read(input *) { return -1; /* fail */ }
virtual int write(output *) { return -1; /* fail */ }
virtual int ioctl(int cmd, void **args) { return -1; }
};
struct input_dev : public abstract_dev {
input_dev() : state(0) {}
int state;
int read(input *) override {
if (state != 2) {
return -1;
}
/* do smth */
return 0;
}
int ioctl(int cmd, void **args) override {
if (cmd == 2) { state = 2; return 0;}
return -1;
}
};
For modes, the kernel uses the ioctl system call to set the mode (as a control plane) and saves the state in the device driver; subsequent reads and writes take the mode into account. In the named pipe example, the FIONREAD ioctl lets you query how many bytes are waiting in the internal buffer.
I hope this helps to find a solution to your problem.

Finding An Alternative To Abusing Enums

In a project I've been helping with recently, the entire code base depends on a monstrous enum that's effectively used as keys for a glorified Hash Table. The only problem is now that it is HUGE, compiling whenever the enum changes is basically a rebuild for an already large code base. This takes forever and I would really LOVE to replace it.
enum Values
{
Value = 1,
AnotherValue = 2,
<Couple Thousand Entries>
NumValues // Sentinel value for creating arrays of the right size
}
What I'm looking for is ways to replace this enum but still have a system that is typesafe (No unchecked strings) and also compatible with MSVC2010 (no constexpr). Extra compiling overhead is acceptable as it might still be shorter time to compile than recompiling a bunch of files.
My current attempts can basically be summed up as delaying defining the values until link time.
Examples of its use
GetValueFromDatabase(Value);
AddValueToDatabase(Value, 5);
int TempArray[NumValues];
Edit: Compiletime and Runtime preprocessing is acceptable. Along with basing it off some kind of caching data structure at runtime.
One way you can achieve this is with a key class that wraps the numeric ID and which cannot be directly instantiated, therefore forcing references to be done through a type-safe variable:
// key.h
namespace keys {
// Identifies a unique key in the database
class Key {
public:
// The numeric ID of the key
virtual size_t id() const = 0;
// The string name of the key, useful for debugging
virtual const std::string& name() const = 0;
};
// The total number of registered keys
size_t count();
// Internal helpers. Do not use directly outside this code.
namespace internal {
// Lazily allocates a new instance of a key or retrieves an existing one.
const Key& GetOrCreate(const std::string& name, size_t id);
}
}
#define DECLARE_KEY(name) \
extern const ::keys::Key& name
#define DEFINE_KEY(name, id) \
const ::keys::Key& name = ::keys::internal::GetOrCreate(#name, id)
With the code above, the definition of keys would look like this:
// some_registration.h
DECLARE_KEY(Value);
DECLARE_KEY(AnotherValue);
// ...
// some_registration.cpp
DEFINE_KEY(Value, 1);
DEFINE_KEY(AnotherValue, 2);
// ...
Importantly, the registration code above could now be split into several separate files, so that you do not need to recompile all the definitions at once. For example, you could break apart the registration into logical groupings, and if you added a new entry, only that one subset would need to be recompiled, and only code that actually depended on the corresponding *.h file would need to be recompiled (other code that didn't reference that particular key value would no longer need to be updated).
The usage would be very similar to before:
GetValueFromDatabase(Value);
AddValueToDatabase(Value, 5);
int* temp = new int[keys::count()];
The corresponding key.cpp file to accomplish this would look like this:
namespace keys {
namespace {
class KeyImpl : public Key {
public:
KeyImpl(const std::string& name, size_t id) : id_(id), name_(name) {}
~KeyImpl() {}
virtual size_t id() const { return id_; }
virtual const std::string& name() const { return name_; }
private:
const size_t id_;
const std::string name_;
};
class KeyList {
public:
KeyList() {}
~KeyList() {
// This will happen only on program termination. We intentionally
// do not clean up "keys_" and just let this data get cleaned up
// when the entire process memory is deleted so that we do not
// cause existing references to keys to become dangling.
}
const Key& Add(const std::string& name, size_t id) {
ScopedLock lock(&mutex_);
if (id >= keys_.size()) {
keys_.resize(id + 1);
}
const Key* existing = keys_[id];
if (existing) {
if (existing->name() != name) {
// Potentially some sort of error handling
// or generation here... depending on the
// desired semantics, for example, below
// we use the Google Log library to emit
// a fatal error message and crash the program.
// This crash is expected to happen at start up.
LOG(FATAL)
<< "Duplicate registration of key with ID "
<< id << " seen while registering key named "
<< "\"" << name << "\"; previously registered "
<< "with name \"" << existing->name() << "\".";
}
return *existing;
}
Key* result = new KeyImpl(name, id);
keys_[id] = result;
return *result;
}
size_t length() const {
ScopedLock lock(&mutex_);
return keys_.size();
}
private:
std::vector<const Key*> keys_;
mutable Mutex mutex_;
};
static LazyStaticPtr<KeyList> keys_list;
}
size_t count() {
return keys_list->length();
}
namespace internal {
const Key& GetOrCreate(const std::string& name, size_t id) {
return keys_list->Add(name, id);
}
}
}
As aptly noted in the comments below, one drawback with an approach that allows for decentralized registration is that it then becomes possible to get into conflict scenarios where the same value is used multiple times (the example code above adds an error for this case, but this occurs at runtime, when it would be really nice to surface such a thing at compile time). Some ways to mitigate this include commit hooks that run tests checking for such a condition or policies on how to select the ID value that reduce the likelihood of reusing an ID, such as a file that indicates the next available ID that must be incremented and submitted as a way to allocate IDs. Alternatively, assuming that you are permitted to reshuffle the IDs (I assumed in this solution that you must preserve the current IDs that you already have), you could change the approach so that the numeric ID is automatically generated from the name (e.g. by taking a hash of the name) and possibly use other factors such as __FILE__ to deal with collisions so that IDs are unique.
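For the last idea, deriving the numeric ID from the name, a simple non-cryptographic hash such as 32-bit FNV-1a would do and needs no constexpr. This is only a sketch: HashKeyName is a hypothetical helper, it assumes unsigned int is 32 bits wide (true for MSVC), and it does nothing about collisions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// 32-bit FNV-1a over the bytes of the key name.
// Assumes unsigned int is 32 bits, so the multiplication wraps modulo 2^32.
unsigned int HashKeyName(const std::string& name) {
    unsigned int hash = 2166136261u;                  // FNV offset basis
    for (std::size_t i = 0; i < name.size(); ++i) {
        hash ^= static_cast<unsigned char>(name[i]);  // mix in one byte
        hash *= 16777619u;                            // FNV prime
    }
    return hash;
}
```

Note that hashed IDs are spread over the whole 32-bit range, so they could no longer be used as indices into a vector the way keys_ is used above; a map keyed by ID, or a second step that assigns dense indices at registration time, would be needed.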

Splitting a file and passing the data on to other classes

In my current project, I have a lot of binary files of different formats. Several of them act as simple archives, and therefore I am trying to come up with a good approach for passing extracted file data on to other classes.
Here's a simplified example of my current approach:
class Archive {
private:
std::istream &fs;
void Read();
public:
Archive(std::istream &fs); // Calls Read() automatically
~Archive();
const char* Get(int archiveIndex);
size_t GetSize(int archiveIndex);
};
class FileFormat {
private:
std::istream &fs;
void Read();
public:
FileFormat(std::istream &fs); // Calls Read() automatically
~FileFormat();
};
The Archive class basically parses the archive and reads the stored files into char pointers.
In order to load the first FileFormat file from an Archive, I would currently use the following code:
std::ifstream fs("somearchive.arc", std::ios::binary);
Archive arc(fs);
std::istringstream ss(std::string(arc.Get(0), arc.GetSize(0)), std::ios::binary);
FileFormat ff(ss);
(Note that some files in an archive could be additional archives but of a different format.)
When reading the binary data, I use a BinaryReader class with functions like these:
BinaryReader::BinaryReader(std::istream &fs) : fs(fs) {
}
char* BinaryReader::ReadBytes(unsigned int n) {
char* buffer = new char[n];
fs.read(buffer, n);
return buffer;
}
unsigned int BinaryReader::ReadUInt32() {
unsigned int buffer;
fs.read((char*)&buffer, sizeof(unsigned int));
return buffer;
}
I like the simplicity of this approach but I'm currently struggling with a lot of memory errors and SIGSEGVs and I'm afraid that it's because of this method. An example is when I create and read an archive repeatedly in a loop. It works for a large number of iterations, but after a while, it starts reading junk data instead.
My question to you is if this approach is feasible (in which case I ask what I am doing wrong), and if not, what better approaches are there?
The flaws of code in the OP are:
You are allocating heap memory and returning a pointer to it from one of your functions. This may lead to memory leaks. You have no problem with leaks (for now) but you must have such stuff in mind while designing your classes.
When dealing with the Archive and FileFormat classes, the user always has to take into account the internal structure of your archive. Basically this compromises the idea of data encapsulation.
When a user of your class framework creates an Archive object, he just gets a way to extract a pointer to some raw data. Then the user must pass this raw data to a completely independent class. Also, you will have more than one kind of FileFormat. Even without the need to watch for leaky heap allocations, dealing with such a system will be highly error-prone.
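The first flaw, handing out raw new[] memory, can be removed by returning an owning container and checking that the read actually succeeded. The following is a sketch rather than a drop-in replacement: it keeps the BinaryReader name and the ReadBytes function from the question, but the std::vector return type and the exception on a short read are assumptions made here. Unchecked short reads are also one possible source of the junk data mentioned in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <vector>

class BinaryReader {
public:
    explicit BinaryReader(std::istream& in) : in_(in) {}

    // Returns an owning buffer, so nothing has to be delete[]d by the caller.
    // Throws instead of handing back a partly filled buffer.
    std::vector<char> ReadBytes(std::size_t n) {
        std::vector<char> buffer(n);
        if (n != 0 && !in_.read(&buffer[0], static_cast<std::streamsize>(n)))
            throw std::runtime_error("BinaryReader: short read");
        return buffer;
    }

private:
    std::istream& in_;
};
```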
Lets try to apply some OOP principles to the task. Your Archive object is a container of Files of different format. So, an Archive's equivalent of Get() should generally return File objects, not a pointer to raw data:
//We are going to need a way to store the file type in your archive index
enum TFileType { BYTE_FILE, UINT32_FILE, /*...*/ };
class BaseFile {
public:
//Virtual destructor: derived objects are deleted through a BaseFile*
virtual ~BaseFile() {}
virtual TFileType GetFileType() const = 0;
/* Your abstract interface here */
};
class ByteFile : public BaseFile {
public:
ByteFile(istream &fs);
virtual ~ByteFile();
virtual TFileType GetFileType() const
{ return BYTE_FILE; }
unsigned char GetByte(size_t index);
protected:
/* implementation of data storage and reading procedures */
};
class UInt32File : public BaseFile {
public:
UInt32File(istream &fs);
virtual ~UInt32File();
virtual TFileType GetFileType() const
{ return UINT32_FILE; }
uint32_t GetUInt32(size_t index);
protected:
/* implementation of data storage and reading procedures */
};
class Archive {
public:
Archive(const char* filename);
~Archive();
BaseFile* Get(int archiveIndex)
{ return (m_Files.at(archiveIndex)); }
/* ... */
protected:
vector<BaseFile*> m_Files;
};
Archive::Archive(const char* filename)
{
ifstream fs(filename, ios::binary);
//Here we need to:
//1. Read archive index
//2. For each file in index do something like:
switch(CurrentFileType) {
case BYTE_FILE:
m_Files.push_back(new ByteFile(fs));
break;
case UINT32_FILE:
m_Files.push_back(new UInt32File(fs));
break;
//.....
}
}
Archive::~Archive()
{
for(size_t i = 0; i < m_Files.size(); ++i)
delete m_Files[i];
}
int main(int argc, char** argv)
{
Archive arch("somearchive.arc");
BaseFile* pbf;
ByteFile* pByteFile;
pbf = arch.Get(0);
//Here we can use GetFileType() or typeid to make a proper cast
//An example of the former:
switch ( pbf->GetFileType() ) {
case BYTE_FILE:
pByteFile = dynamic_cast<ByteFile*>(pbf);
ASSERT(pByteFile != 0 );
//Working with byte data
break;
/*...*/
}
//alternatively you may omit GetFileType() and rely solely on C++
//typeid-related stuff
}
That's just a general idea of classes that may simplify the use of archives in your application.
Keep in mind, though, that while good class design helps with preventing memory leaks, clarifying code and so on, you will still have to deal with binary data storage problems whatever classes you have. For example, if your archive stores 64 bytes of byte data followed by 8 uint32s and you somehow read 65 bytes instead of 64, every following int will be read as junk. You may also run into alignment and endianness problems (the latter matters if your applications are supposed to run on several platforms). Still, good class design can help you write better code that addresses such problems.
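To illustrate the endianness point, here is a minimal sketch of reading a 32-bit value byte by byte instead of copying raw memory into an unsigned int. It assumes the archive format stores integers in little-endian order, which the question does not state; ReadUInt32LE and ReadUInt32LEExample are hypothetical names, not part of the OP's BinaryReader.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper. Assumes the file stores 32-bit integers
// in little-endian byte order (least significant byte first).
std::uint32_t ReadUInt32LE(std::istream& in) {
    unsigned char b[4];
    in.read(reinterpret_cast<char*>(b), sizeof b);
    // gcount() reports how many bytes the last read actually delivered.
    if (in.gcount() != static_cast<std::streamsize>(sizeof b)) {
        throw std::runtime_error("unexpected end of stream");
    }
    // Assemble the value explicitly, so the result does not depend
    // on the byte order or alignment rules of the host machine.
    return  static_cast<std::uint32_t>(b[0])
         | (static_cast<std::uint32_t>(b[1]) << 8)
         | (static_cast<std::uint32_t>(b[2]) << 16)
         | (static_cast<std::uint32_t>(b[3]) << 24);
}

// Usage sketch: decode one value from an in-memory buffer.
void ReadUInt32LEExample() {
    std::istringstream in(std::string("\x78\x56\x34\x12", 4));
    assert(ReadUInt32LE(in) == 0x12345678u);
}
```

Because a short read throws instead of returning whatever happened to be in the buffer, a misaligned archive shows up as an error at the point of the read rather than as junk values later on.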
It is asking for trouble to pass a pointer from your function and expect the user to know to delete it, unless the function name is such that it is obvious to do so, e.g. a function that begins with the word create.
So
Foo * createFoo();
is likely to be a function that creates an object that the user must delete.
A preferable solution would, for starters, be to return std::vector<char> or allow the user to pass std::vector<char> & to your function and you write the bytes into it, setting its size if necessary. (This is more efficient if doing multiple reads where you can reuse the same buffer).
You should also learn const-correctness.
As for your "after a while it fills with junk", where do you check for end of file?
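The two suggestions above (hand back the bytes in a std::vector<char>, and check for end of file) could be combined roughly as follows. This is only a sketch assuming C++11; it uses a free function rather than the OP's BinaryReader member, and the bool return value is my own choice for reporting a short read.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical variant of ReadBytes. The vector owns the memory, so the
// caller has nothing to delete, and the same buffer can be reused.
// Returns false if the stream ended before n bytes were available;
// out is then shrunk to the bytes that were actually read.
bool ReadBytes(std::istream& in, std::size_t n, std::vector<char>& out) {
    out.resize(n);
    if (n == 0) return true;
    in.read(out.data(), static_cast<std::streamsize>(n));
    const std::size_t got = static_cast<std::size_t>(in.gcount());
    out.resize(got);
    return got == n;
}

// Usage sketch: the second read hits end of file and reports it.
void ReadBytesExample() {
    std::istringstream in("abcdef");
    std::vector<char> buf;
    assert(ReadBytes(in, 4, buf) && buf.size() == 4);   // "abcd"
    assert(!ReadBytes(in, 4, buf) && buf.size() == 2);  // only "ef" was left
}
```

With the original ReadBytes and ReadUInt32, a read past the end of the data leaves the buffer uninitialized and returns it anyway, which would be one way to end up with junk values.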

Static vs. member variable

For debugging, I would like to add some counter variables to my class. But it would be nice to do it without changing the header to cause much recompiling.
If I've understood the static keyword correctly, the following two snippets would be pretty much identical, assuming of course that there is only one instance.
class FooA
{
public:
FooA() : count(0) {}
~FooA() {}
void update()
{
++count;
}
private:
int count;
};
vs.
class FooB
{
public:
FooB() {}
~FooB() {}
void update()
{
static int count = 0;
++count;
}
};
In FooA, count can be accessed anywhere within the class, but it also bloats the header, and the header has to change again when the variable is no longer needed.
In FooB, the variable is only visible within the one function where it exists, so it is easy to remove later. The only drawback I can think of is that FooB's count is shared among all instances of the class, but that's not a problem in my case.
Is this a correct use of the keyword? I assume that once count is created in FooB, it stays created and is not re-initialized to zero on every call to update.
Are there any other caveats or heads-up I should be aware of?
Edit: After being notified that this would cause problems in multithreaded environments, I clarify that my codebase is singlethreaded.
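Your assumptions about static function variables are correct: count is initialized once and keeps its value between calls. If update() is called from multiple threads, though, the unsynchronized increment is a data race, so the count may not be correct. Consider using an atomic increment such as InterlockedIncrement().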
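InterlockedIncrement() is a Windows API; if C++11 is available, std::atomic is a portable alternative. The sketch below shows both points at once: the function-local static is initialized only on the first call and is shared by every instance. Here update() returns the new count purely so the behaviour can be observed (the original returns void), and FooBExample is a hypothetical name.

```cpp
#include <atomic>
#include <cassert>

class FooB {
public:
    // Returns the new count only for demonstration purposes.
    int update() {
        // Initialized once, on the first call, and shared by every FooB.
        // std::atomic makes the increment safe across threads.
        static std::atomic<int> count(0);
        return ++count;
    }
};

void FooBExample() {
    FooB a, b;
    assert(a.update() == 1);
    assert(b.update() == 2);  // same counter, different instance
    assert(a.update() == 3);  // not reset between calls
}
```

In a single-threaded program a plain static int behaves the same way; the atomic only matters once a second thread calls update().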
What you really want, for your long term C++ toolbox is a threadsafe, general purpose debug counters class that allows you to drop it in anywhere and use it, and be accessible from anywhere else to print it. If your code is performance sensitive, you probably want it to automatically do nothing in non-debug builds.
The interface for such a class would probably look like:
class Counters {
public:
// Counters singleton request pattern.
// Counters::get()["my-counter"]++;
static Counters& get() {
if (!_counters) _counters = new Counters();
return *_counters;
}
// Bad idea if you want to deal with multithreaded things.
// If you do, either provide an Increment(int inc_by); function instead of this,
// or return some sort of atomic counter instead of an int.
int& operator[](const string& key) {
if (__DEBUG__) {
return _counter_map.operator[](key);
} else {
return _bogus;
}
}
// you have to deal with exposing iteration support.
private:
Counters() : _bogus(0) {}
// Kill copy and operator= (declared but never defined)
Counters(const Counters&);
Counters& operator=(const Counters&);
// Singleton member.
static Counters* _counters;
// Map to store the counters.
std::map<string, int> _counter_map;
// Bogus counter for opt builds.
int _bogus;
};
Once you have this, you can drop it in at will wherever you want in your .cpp file by calling:
void Foo::update() {
// Leave this in permanently, it will automatically get killed in OPT.
Counters::get()["update-counter"]++;
}
And in your main, if you have built in iteration support, you do:
int main(...) {
...
for (Counters::const_iterator i = Counters::get().begin(); i != Counters::get().end(); ++i) {
cout << i->first << ": " << i->second << '\n';
}
...
}
Creating the Counters class is somewhat heavyweight, but if you do a lot of C++ coding, you may find it useful to write it once and then simply link it in as part of any library.
The major problems with static variables occur when they are used together with multi-threading. If your app is single-threaded, what you are doing is quite correct.
What I usually do in this situation is to put count in an anonymous namespace in the source file for the class. This means that you can add or remove the variable at will, it can be used anywhere in the file, and there is no chance of a name conflict. It does have the drawback that it can only be used in functions in the source file, not in inline functions in the header file, but I think that is what you want.
In file FooC.cpp
namespace {
int count=0;
}
void FooC::update()
{
++count;
}