read data in initialization list - c++

At class instantiation, I would like to read data from a file and process it into a number of class objects. What I did so far (and works well) is
myData::myData(const std::string & file):
data1_(this->read(file)),
processedData1_(this->createProcessedData1_(data1_)),
processedData2_(this->createProcessedData2_(data1_)),
processedData3_(this->createProcessedData3_(data1_))
{
}
In a different class, the read() method creates more than one raw data object. In this case, I don't know how to pack things into the initializer list, so I'm doing something along the lines of
myData::myData(const std::string & file):
data1_(),
data2_(),
processedData1_(),
processedData2_(),
processedData3_()
{
this->read(file); // fills data1_, data2_
processedData1_ = this->createProcessedData1_(data1_, data2_);
processedData2_ = this->createProcessedData2_(data1_, data2_);
processedData3_ = this->createProcessedData3_(data1_, data2_);
}
What I don't like about this approach is that
the data is initialized twice: once (empty) in the initializer list, once filled with actual content in the constructor; and that
I cannot mark any of the (processed) data objects as const.
Is there a way to organize the object creation such that it all happens in the initialization list?

You may think about splitting the data loading / processing in a factory-like static method that in turn constructs a myData instance (passing the processedData*_ values as constructor params).
This way you can keep the loading and processing separate from the class that may end up just storing the results and possibly provide further processing or accessors to parts of the data.
Could something like
class MyData {
public:
MyData(DataType processedData1, ...) : processedData1_(processedData1) ... { }
private:
const DataType processedData1_;
};
struct DataContainer {
DataType data1;
DataType data2;
};
DataContainer read(const std::string& file) { ... }
DataType createProcessedData1(DataType data) { ... }
...
// hands ownership to caller
MyData* LoadData(const std::string & file) {
DataContainer d = read(file);
return new MyData(createProcessedData1(d.data1), createProcessedData2(d.data2), ...);
}
work for you?
I'm assuming you don't need to keep state when loading and processing data. If this isn't the case you can make read and createProcessedData* members of a MyDataLoader class.

The only way to supply a value to a const member variable (without const cast or using mutable keyword) would be to provide it in the constructor initializer list. See this: How to initialize a const field in constructor.
If you prefer not to resort to the 'ugly' casts or keywords and need your data to be const (for example for use in const functions), I would go for a small myDataInitialiser class that would first read the data. Once the data is read, you could pass the whole initialiser instance to the constructor of your original myData class.
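A minimal sketch of that initialiser idea, under stated assumptions: the names `MyDataInitialiser`, `RawData` and the input format (a count, then that many values for the first data set, then the rest for the second) are invented for illustration, and the processing step is a stand-in that simply sums the values. It reads from a `std::istream` so it can be tried without a file.

```cpp
#include <cstddef>
#include <istream>
#include <numeric>
#include <sstream>
#include <vector>

// Hypothetical raw-data type; the question does not say what data1_/data2_ are.
typedef std::vector<int> RawData;

// Reads everything up front so the main class can initialise const members.
class MyDataInitialiser {
public:
    // Assumed format: a count n, then n values for data1, then the rest for data2.
    explicit MyDataInitialiser(std::istream& in) {
        std::size_t n = 0;
        in >> n;
        int value = 0;
        while (data1.size() < n && in >> value) data1.push_back(value);
        while (in >> value) data2.push_back(value);
    }
    RawData data1;
    RawData data2;
};

class MyData {
public:
    // Every member is set in the initializer list, so every member can be const.
    explicit MyData(const MyDataInitialiser& init)
        : data1_(init.data1),
          data2_(init.data2),
          processedData1_(createProcessedData1(data1_, data2_)) {}

    int processedData1() const { return processedData1_; }

private:
    // Stand-in for the real processing step: sums both raw data sets.
    static int createProcessedData1(const RawData& a, const RawData& b) {
        return std::accumulate(a.begin(), a.end(), 0) +
               std::accumulate(b.begin(), b.end(), 0);
    }
    // Declaration order matters: members are initialised in this order.
    const RawData data1_;
    const RawData data2_;
    const int processedData1_;
};
```

With a real file, one would presumably open a `std::ifstream`, build a `MyDataInitialiser` from it, and pass that object to the `MyData` constructor. The raw data is copied once here; whether that matters depends on its size.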

Related

Efficient way of storing a large amount of character data between transactions in C++

For our application we have the following scenario:
Firstly, we get a large amount of data (in some cases, this can be more than 100MB) through a 3rd party API into our class via a constructor, like:
class DataInputer
{
public:
DataInputer(int id, const std::string& data) : m_id(id), m_data(data) {}
int handle() { /* Do some stuff */ }
private:
int m_id;
std::string m_data;
};
The chain of invocation going into our class DataInputer looks like:
int dataInputHandler()
{
std::string inputStringFromThirdParty = GiveMeStringFrom3rdPartyMagic(); // <- 1.
int inputIntFromThirdParty = GiveMeIntFrom3rdPartyMagic();
return DataInputer(inputIntFromThirdParty, inputStringFromThirdParty).handle();
}
We have some control over how dataInputHandler handles its string (the line marked with 1. is where the string is created as an actual object), but no control over what GiveMeStringFrom3rdPartyMagic actually uses to provide it (if it's important for anyone, this data is coming from somewhere via a network connection). As a consolation, we have full control over the DataInputer class.
Now, what the application is supposed to do is hold on to the string and the associated integer ID until a later point, when it can send them to another component (via a different network connection), provided the component supplies a valid ID (this is the short description). The problem is that we can't (don't want to) do it in the handle method of the DataInputer class, because that would block it for an unknown amount of time.
As a rudimentary solution, we were thinking of creating an "in-memory" string store for all the various strings that will come in from all the various network clients, where the bottom line consists of a:
std::map<int, std::string> idStringStore;
where the int identifies the id of the string, the string is actually the data, and DataInputer::handle does something like idStringStore.emplace(m_id, m_data).
The problem is that unnecessarily copying a string that is hundreds of megabytes in size can be very time consuming, so I would like to ask the community if they have any recommendations or best practices for scenarios like this.
An important mention: we are bound to C++11 for now :(
Use move semantics to pass the 3rd-party data into your DataInputer constructor. The std::move in the member initializer is required: inside the constructor, data is a named variable and therefore an lvalue, so without std::move the string would be copied:
class DataInputer
{
public:
DataInputer(int id, std::string&& data) : m_id(id), m_data(std::move(data)) {}
int handle() { /* Do some stuff */ }
private:
int m_id;
std::string m_data;
};
And pass GiveMeStringFrom3rdPartyMagic() directly as an argument to the constructor without first copying into inputStringFromThirdParty.
int dataInputHandler()
{
int inputIntFromThirdParty = GiveMeIntFrom3rdPartyMagic();
return DataInputer(inputIntFromThirdParty, GiveMeStringFrom3rdPartyMagic()).handle();
}
Of course, you can use a std::map or any other STL container that supports move-semantics. The point is that move-semantics, generally, is what you're looking to use to avoid needless copies.
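To connect this with the std::map store from the question, here is a sketch that stays within C++11. The free function `takeString` is a hypothetical helper, not something from the question, and the store is not thread-safe as written; a real version shared between network threads would presumably need a mutex.

```cpp
#include <map>
#include <string>
#include <utility>

// Global store as sketched in the question (not thread-safe as written).
std::map<int, std::string> idStringStore;

class DataInputer {
public:
    DataInputer(int id, std::string&& data) : m_id(id), m_data(std::move(data)) {}

    // Moves the payload into the store instead of copying it.
    // Afterwards m_data is in a valid but unspecified state.
    // Returns 0 on success, -1 if the id was already present
    // (in which case the payload may already have been moved from).
    int handle() {
        bool inserted = idStringStore.emplace(m_id, std::move(m_data)).second;
        return inserted ? 0 : -1;
    }

private:
    int m_id;
    std::string m_data;
};

// Hypothetical helper: moves the stored string out and erases the entry,
// so sending it on later does not copy it either.
bool takeString(int id, std::string& out) {
    std::map<int, std::string>::iterator it = idStringStore.find(id);
    if (it == idStringStore.end()) return false;
    out = std::move(it->second);
    idStringStore.erase(it);
    return true;
}
```

Moving a std::string normally transfers ownership of its heap buffer, so the cost should not depend on the payload size.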

Updating data members of different derived classes of the same base class within a vector

I am writing a 3D gridded model in C++ which has different cell types, all stored within a vector that is in a Grid class. I have defined a base GridCell class and I also have two derived classes GridCell1 and GridCell2.
Now in setting up the model, I read in a text file that tells me how to fill my gridCell vector (std::vector<gridCell*> gridCellVector) in the Grid class; meaning it tells me what types of derived cells to push_back into my gridCellVector.
Then I read in another input file that contains initial state variable information for each GridCell in my Grid, in the order laid out by the 1st input file.
Each derived class (GridCell1 and GridCell2) has some state variables (private data members) that the other doesn't. How can I (or is it possible to) access and update/initialize/set the derived class' data members as I read in the second input file?
I've tried a couple different things and seem only able to return my get/set functions defined in the GridCell base class. I can't figure out how to access the functions in the derived classes when working with each derived GridCell as I step through the vector.
Edit: I am surprised people haven't mentioned downcasting, other than saying not to use dynamic_cast. I always know the type of GridCell I am updating because I keep track of what has been loaded into the vector when reading in the first input file. Since I am always certain of the type of GridCell, isn't dynamic_cast safe?
Double Edit: Because I pass the GridCell objects to other functions that need to reference the data members and functions specific to the appropriate GridCell instance of the passed object, I'm realizing the design (of many parts) of my model does not currently pass muster. So, for now, I'm giving up on the idea of having derived GridCell types at all and will just create one huge GridCell class that fits all my needs. This way I can fill, and then access, whatever data members and functions I need later on down the line.
If you're sure you want to use a two-step process, I suggest you give GridCell a pure virtual init method:
virtual void init(istream &) = 0;
then implement it in each derived class. Its purpose is to read data from the file and initialize the initial state variables.
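A rough sketch of that two-step approach, assuming each cell type reads a single state value (an int for GridCell1, a double for GridCell2). The member names and the `initAll` helper are made up for illustration.

```cpp
#include <istream>
#include <memory>
#include <sstream>
#include <vector>

class GridCell {
public:
    virtual ~GridCell() {}
    // Each derived class reads its own state variables from the stream.
    virtual void init(std::istream& in) = 0;
};

class GridCell1 : public GridCell {
public:
    void init(std::istream& in) override { in >> state_; }
    int state() const { return state_; }
private:
    int state_ = 0;
};

class GridCell2 : public GridCell {
public:
    void init(std::istream& in) override { in >> state_; }
    double state() const { return state_; }
private:
    double state_ = 0.0;
};

// Second pass: the cells already exist; each one initialises itself.
// Returns false if the stream ran out or held malformed data.
bool initAll(const std::vector<std::unique_ptr<GridCell>>& cells, std::istream& in) {
    for (const auto& cell : cells) {
        cell->init(in);
        if (!in) return false;
    }
    return true;
}
```

The loop never needs to know which derived type it is handling, so no downcast is involved.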
Single pass
As others have said, it may be best to read both files at once and do the derived class specific initialization at the same time as creating the derived classes:
std::unique_ptr<GridCell> createGridCell1(std::istream& init) {
auto cell = std::make_unique<GridCell1>();
int value;
init >> value;
cell->setGridCell1State(value);
return cell;
}
std::unique_ptr<GridCell> createGridCell2(std::istream& init) {
// similarly to createGridCell1()...
}
std::vector<GridCell::Ptr> createCells(std::istream& types, std::istream& init) {
std::vector<GridCell::Ptr> cells;
std::string type;
while (types >> type) {
if (type == "GridCell1")
cells.push_back(createGridCell1(init));
else
cells.push_back(createGridCell2(init));
}
return cells;
}
int main() {
auto types = std::istringstream("GridCell1 GridCell2 GridCell1 GridCell1");
auto init = std::istringstream("1 2.4 2 3");
auto cells = createCells(types, init);
for (auto& cell : cells)
cell->put();
}
Two pass with Visitor
If you must do the initialization in a second pass you could use the Visitor pattern. You have some sort of GridCellVisitor that knows how to visit all the different kinds of grid cells:
class GridCellVisitor {
protected:
~GridCellVisitor() = default;
public:
virtual void visit(GridCell1& cell) = 0;
virtual void visit(GridCell2& cell) = 0;
};
and your grid cells know how to accept a GridCellVisitor:
class GridCell1 : public GridCell {
int state = 0;
public:
void setGridCell1State(int value) { state = value; }
void accept(GridCellVisitor& visitor) override { visitor.visit(*this); }
};
class GridCell2 : public GridCell {
double state = 0.0;
public:
void setGridCell2State(double value) { state = value; }
void accept(GridCellVisitor& visitor) override { visitor.visit(*this); }
};
This way you can separate the responsibility of initializing the grid cells with an input stream from the grid cells themselves and avoid having to do fragile downcasts on the grid cells:
class GridCellStreamInitializer : public GridCellVisitor {
std::istream* in;
public:
GridCellStreamInitializer(std::istream& in) : in(&in){}
void visit(GridCell1& cell) override {
int value;
*in >> value;
cell.setGridCell1State(value);
}
void visit(GridCell2& cell) override {
double value;
*in >> value;
cell.setGridCell2State(value);
}
};
int main() {
auto in = std::istringstream("GridCell1 GridCell2 GridCell1 GridCell1");
auto cells = createCells(in);
auto init = std::istringstream("1 2.4 2 3");
auto streamInitializer = GridCellStreamInitializer(init);
for (auto& cell : cells)
cell->accept(streamInitializer);
}
The downside is GridCellVisitor must be aware of all different kinds of grid cells so if you add a new type of grid cell you have to update the visitor. But as I understand it your code that reads the initialization file must be aware of all the different kinds of grid cells anyway.
Your vector<gridCell*> knows only the base class of its elements and can hence only call gridCell functions.
I understand that your approach is to first fill the vector with pointers to cells of the correct derived type, and never the base type. Then, for each cell, you read class-dependent data.
The easiest way, if you don't want to change approach
The cleanest way would be to define a virtual load function in the base cell:
class gridCell {
...
virtual bool load (ifstream &ifs) {
// load the common data of all gridCells and derivates
return ifs.good();
}
};
The virtual function would be overridden by the derived cells:
class gridCell1 : public gridCell {
...
bool load (ifstream &ifs) override {
if (gridCell::load(ifs)) { // first load the common part
// load the derivate specific data
}
return ifs.good();
}
};
Finally, you can write your container loading function:
class Grid {
...
bool load (ifstream &ifs) {
for (auto x:gridCellVector)
if (!x->load(ifs))
return false; // error ? premature end of file ? ...
return true; // every cell loaded
}
};
The cleanest way?
Your problem looks very much like a serialisation problem. You load grids; maybe you write grids as well? If you control the file format, and perform the creation and loading of the cells in a single pass, then you don't need to reinvent the wheel and could opt for a serialisation library, like boost::serialization.

generic message dispatching library?

Is there a standard way to get rid of the switch/case block in a read loop?
i.e.
enum msg_type
{
message_type_1,
//msg types
};
struct header
{
msg_type _msg_type;
uint64_t _length;
};
struct message1
{
header _header;
//fields
};
struct message2
{
header _header;
//fields
};
//socket read loop
void read(/* blah */)
{
//suppose we have full message here
char* buffer; //the buffer that holds data
header* h = (header*)buffer;
msg_type type = h->_msg_type;
switch(type)
{
case message_type_1:
{
message1* msg1 = (message1*)buffer;
//Call handler function for this type
break;
}
//rest
}
}
this means that I have to inherit from a handler container base class which is of the form:
class handler_container_base
{
public:
virtual void handle(message1* msg){}
virtual void handle(message2* msg){}
//etc
};
and pass an object of that type to somewhere the message loop can see it, so that the loop can call those handlers back.
One problem is, even when I want to implement and register only one handler for a single type I have to inherit from this class.
Another is this just looks ugly.
I was wondering if there are existing libraries which handle this problem (should be free). Or is there no better way of doing this rather than like this?
Other approaches that avoid inheritance are:
For a closed set of types:
Use a variant:
variant<message1_t, message2_t> my_message;
With a visitor you can do the rest. I recommend boost.variant.
You can also use boost::any, for an open set of types, and copy the messages around at runtime. At some point you will have to cast back to the original type, though.
Another solution goes along the lines of Poco.DynamicAny, which will try to convert to the type on the left in an assignment, similar to a dynamic language. But you need to register converters yourself for your types.
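As an illustration of the closed-set option, here is a sketch that swaps in std::variant and std::visit (C++17), the standard-library counterparts of boost::variant and boost::apply_visitor, so that it compiles without Boost. The message fields and the `describe_visitor` and `dispatch` names are invented for the example.

```cpp
#include <cstdint>
#include <string>
#include <variant>

// Hypothetical message types; the fields are invented for illustration.
struct message1 { std::uint32_t id; };
struct message2 { std::string text; };

using any_message = std::variant<message1, message2>;

// A visitor only needs one overload per type in the variant; no base class.
struct describe_visitor {
    std::string operator()(const message1& m) const {
        return "message1 id=" + std::to_string(m.id);
    }
    std::string operator()(const message2& m) const {
        return "message2 text=" + m.text;
    }
};

// Replaces the handler switch: std::visit picks the overload for the held type.
std::string dispatch(const any_message& msg) {
    return std::visit(describe_visitor{}, msg);
}
```

Decoding the raw buffer into the variant still needs one place that looks at the header's type field, but every handler after that point is free of switch statements and of a common handler base class. A visitor that forgets one of the types fails to compile, which is the main benefit of keeping the set closed.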

Initialising a 2D vector with values at declaration

I'm currently working on a program which randomly generates items (e.g. weapons, armour etc.) and I want to make global constant vectors which hold all the names that could be given to the items. I want to have this 2D vector in a header file that's available to all my other classes (but not modifiable), so I need to initialise it at declaration.
I previously used the following:
static const std::string v[] =
{
"1.0", "1.1", "1.2", "null"
};
const std::vector<std::string> versions( v, v+sizeof( v)/sizeof( v[0]));
This worked for a 1D vector, however I want to use a 2D vector to store item names.
I have tried using the following however it means I don't have the member functions (such as size()):
static const std::string g_wn_a[] = { "Spear", "Lance", "Jouster" };
static const std::string g_wn_b[] = { "Sword", "Broadsword", "Sabre", "Katana" };
const std::string* g_weapon_names[] = { g_wn_a, g_wn_b };
I also don't want to use a class to store all the names because I feel it would be inefficient to have variables created to store all the names every time I wanted to use them.
Does anyone know how I can solve my problem?
You could use a class with const static members. This way, your class would just behave like a namespace and you wouldn't have to create an instance of the name-holding class to use the names.
struct MyNames {
// const static things
static const string myweapon; // declaration only
};
// definition, in exactly one source file:
const string MyNames::myweapon = "Katana";
string s = MyNames::myweapon; // s = "Katana"
This is C++, so the most common way to do this is to write a class that does this in its constructor, and then create a const object of that class. Your class would then provide various member functions to query the various items it maintains.
As a bonus, this will make it easier for the rest of your code to use the various items.
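One possible shape for such a class, as a sketch only: `WeaponNames`, its member functions and the `weaponNames()` accessor are hypothetical names, and the data is the two arrays from the question. It avoids C++11 initializer lists, in keeping with the array-based code in the question.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical name table; builds the 2D vector once in its constructor.
class WeaponNames {
public:
    WeaponNames() {
        static const char* const polearms[] = { "Spear", "Lance", "Jouster" };
        static const char* const swords[] = { "Sword", "Broadsword", "Sabre", "Katana" };
        add(polearms, sizeof(polearms) / sizeof(polearms[0]));
        add(swords, sizeof(swords) / sizeof(swords[0]));
    }
    std::size_t categoryCount() const { return names_.size(); }
    std::size_t count(std::size_t category) const { return names_.at(category).size(); }
    const std::string& name(std::size_t category, std::size_t index) const {
        return names_.at(category).at(index);
    }
private:
    // Appends one row built from a plain array of C strings.
    void add(const char* const* first, std::size_t n) {
        names_.push_back(std::vector<std::string>(first, first + n));
    }
    std::vector<std::vector<std::string> > names_;
};

// One shared const object; a function-local static sidesteps
// initialisation-order problems between translation units.
inline const WeaponNames& weaponNames() {
    static const WeaponNames instance;
    return instance;
}
```

The table is built once, on first use, so callers such as `weaponNames().name(1, 3)` do not create new name storage each time.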

How to write a cctor and op= for a factory class with ptr to abstract member field?

I'm extracting files from zip and rar archives into raw buffers. I created the following to wrap minizip and unrarlib:
Archive.hpp - Used to access everything. If I could make all the functions in the other classes inaccessible from the outside, I would. (Actually, I suppose I could friend all the other classes in Archive and use private function callbacks..., but that's soo roundabout.)
#include "ArchiveBase.hpp"
#include "ArchiveDerived.hpp"
class Archive {
public:
Archive(string path) {
/* logic here to determine type */
switch(type) {
case RAR:
archive_ = new ArchiveRar(path);
break;
case ZIP:
archive_ = new ArchiveZip(path);
break;
case UNKNOWN_ARCHIVE:
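throw std::runtime_error("unknown archive type"); // needs <stdexcept>; a bare "throw;" with no active exception calls std::terminate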
}
}
Archive(Archive& other) {
archive_ = // how do I copy an abstract class?
}
~Archive() { delete archive_; }
void passThrough(ArchiveBase::Data& data) { archive_->passThrough(data); }
Archive& operator = (Archive& other) {
if (this == &other) return *this;
ArchiveBase* newArchive = // can't instantiate....
delete archive_;
archive_ = newArchive;
return *this;
}
private:
ArchiveBase* archive_;
};
ArchiveBase.hpp
class ArchiveBase {
public:
// Is there any way to put this struct in Archive instead,
// so that outside classes instantiating one could use
// Archive::Data instead of ArchiveBase::Data?
struct Data {
int field;
};
virtual void passThrough(Data& data) = 0;
/* more methods */
};
ArchiveDerived.hpp "Derived" being "Zip" or "Rar"
#include "ArchiveBase.hpp"
class ArchiveDerived : public ArchiveBase {
public:
ArchiveDerived(string path);
void passThrough(ArchiveBase::Data& data);
private:
/* fields needed by minizip/unrarlib */
// example zip:
unzFile zipFile_;
// example rar:
RARHANDLE rarFile_;
};
ArchiveDerived.cpp
#include "ArchiveDerived.hpp"
ArchiveDerived::ArchiveDerived(string path) { /* implement */ }
void ArchiveDerived::passThrough(ArchiveBase::Data& data) { /* implement */ }
Somebody had suggested I use this design so that I could do:
Archive archiveFile(pathToZipOrRar);
archiveFile.passThrough(extractParams); // yay polymorphism!
How do I write a cctor for Archive?
What about op= for Archive?
What can I do about "renaming" ArchiveBase::Data to Archive::Data? (Both minizip and unrarlib use such structs for input and output. Data is generic for Zip & Rar and later is used to create the respective library's struct.) Everything else is accessed via Archive, and I'd like to make declaring Data in an outside class this way as well.
I know I could throw away my current class Archive, name ArchiveBase into Archive, and use a global factory function. However, I wanted to avoid using the global function.
First of all you can't "copy" an abstract class because you can't instantiate one. Instead, what you should do is set up a std::tr1::shared_ptr of that class and pass in a pointer.
Archive(ArchiveBase *_archiveBase)
Use a factory function outside of the Archive class for instantiation.
Archive createArchive(string path, int type){
switch(type) {
case RAR:
return Archive( new ArchiveRar(path) );
case ZIP:
return Archive( new ArchiveZip(path) );
case UNKNOWN_ARCHIVE:
// std::runtime_error (from <stdexcept>) accepts a message; std::exception portably does not
throw std::runtime_error("Unknown archive format");
default:
throw std::runtime_error("Improper archive type");
}
}
For the = operator, simply holding a smart pointer such as this and assigning it with "=" safely shares the underlying ArchiveBase between Archive objects. It performs reference counting and deletes the pointer only when it's safe to do so, so you don't have to.
Archive& operator = (Archive& other) {
m_ArchiveBasePtr = other.m_ArchiveBasePtr;
return *this;
}
Let the smart pointers worry about deleting, copying, and all that for you.
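A small sketch of what that looks like, using std::shared_ptr (the C++11 successor of std::tr1::shared_ptr). `ArchiveRar` here is a stand-in with no unrarlib handle, and `kind()` and `owners()` are hypothetical members added only to make the sharing visible.

```cpp
#include <memory>
#include <string>

class ArchiveBase {
public:
    virtual ~ArchiveBase() {}
    virtual std::string kind() const = 0;
};

// Stand-in for the real ArchiveRar.
class ArchiveRar : public ArchiveBase {
public:
    std::string kind() const override { return "rar"; }
};

class Archive {
public:
    explicit Archive(ArchiveBase* impl) : impl_(impl) {}  // shared_ptr takes ownership
    // No hand-written copy constructor, operator= or destructor:
    // the compiler-generated ones copy and release the shared_ptr.
    std::string kind() const { return impl_->kind(); }
    long owners() const { return impl_.use_count(); }
private:
    std::shared_ptr<ArchiveBase> impl_;
};
```

Copies of an `Archive` share one underlying object, which is deleted when the last copy goes away. That is a shallow copy, so it only fits if sharing the open archive between copies is acceptable.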
Wheaties' suggestion works when you can afford shallow copies and an N-to-1 relationship. It breaks down when ArchiveBase subclasses contain data specific to each Archive instance, which cannot be shared gracefully across multiple objects.
An alternative approach to a global createArchive() function is to add a pure virtual clone() method to ArchiveBase, and then define it in each subclass (ArchiveZip, ArchiveRar) to appropriately replicate a shallow or deep copy as needed.
You can then call archive_->clone() from Archive's copy constructor or operator= if archive_ is not NULL. (In operator=, be sure to delete the previously held object afterwards!)
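The clone() approach could look roughly like this. `ArchiveZip` is reduced to a stand-in that stores only the path (a real one would have to decide how to duplicate or reopen its minizip handle), and `kind()` and `impl()` are hypothetical members added for demonstration. The assignment operator uses the copy-and-swap idiom, which is one way to get the "delete the old one afterwards" step right.

```cpp
#include <string>
#include <utility>

class ArchiveBase {
public:
    virtual ~ArchiveBase() {}
    // Each subclass returns a heap-allocated copy of itself.
    virtual ArchiveBase* clone() const = 0;
    virtual std::string kind() const = 0;
};

// Stand-in for the real ArchiveZip; holds only the path.
class ArchiveZip : public ArchiveBase {
public:
    explicit ArchiveZip(const std::string& path) : path_(path) {}
    ArchiveBase* clone() const override { return new ArchiveZip(*this); }
    std::string kind() const override { return "zip:" + path_; }
private:
    std::string path_;
};

class Archive {
public:
    explicit Archive(ArchiveBase* impl) : archive_(impl) {}  // takes ownership
    Archive(const Archive& other)
        : archive_(other.archive_ ? other.archive_->clone() : nullptr) {}
    // Copy-and-swap: the by-value parameter is already a clone; swapping hands
    // the old implementation to `other`, whose destructor then deletes it.
    Archive& operator=(Archive other) {
        std::swap(archive_, other.archive_);
        return *this;
    }
    ~Archive() { delete archive_; }
    std::string kind() const { return archive_ ? archive_->kind() : "none"; }
    const ArchiveBase* impl() const { return archive_; }
private:
    ArchiveBase* archive_;
};
```

Because the clone is made before anything is deleted, self-assignment and an exception thrown from clone() both leave the target object unchanged.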
What can I do about "renaming" ArchiveBase::Data to Archive::Data? (Both minizip and unrarlib use such structs for input and output. Data is generic for Zip & Rar and later is used to create the respective library's struct.)
There are several options:
void Archive::passThrough() { archive_->passThrough( this->getData() ); }
void Archive::passThrough() { archive_->passThrough( this ); }
Or you could maintain a backward reference in ArchiveBase to the corresponding Archive object, which could then be queried for Data.
Though use care! It's easy to get such duplicated information out of sync. (And you can get into header-file loops.) That's why I favor passing the this pointer around! You can always predeclare "class Archive;" and then use Archive* pointers without including Archive.hpp in the header file. (Though you will still need to include Archive.hpp in the .cpp file.)