I have the following functions
LinearScheme::LinearScheme() {
cout << " empty constructor" << endl;
}
void LinearScheme::init(
int tableId,
std::string &basePath,
std::vector<size_t> &colElemSizes,
TupleDescMap &tupleDescMap,
size_t defaultMaxFragmentSize,
int numCols,
BoundBases &bounds,
std::vector<int> &colsPartitioned )
{
// This linear scheme ignores bounds
// it could be improved to use colsPartitioned for ordering (TODO)
cout << "init Linear Scheme " << endl;
*this = LinearScheme(); //SEGFAULTS HERE
cout << "after cons here?" << endl;
// init private fields
this->tableId_ = tableId;
this->basePath_ = basePath;
this->colElemSizes_ = colElemSizes;
this->numCols_ = numCols;
this->tupleDescMap_ = tupleDescMap;
this->numFragments_ = 0;
this->defaultMaxFragmentSize_ = defaultMaxFragmentSize;
// fragmentSizesFilename_ init
fragmentSizesFilename_ = basePath_ + boost::lexical_cast <string>(tableId_)
+ "_cs";
struct stat st;
// open existing file if exists. Create new otherwise.
if (stat(fragmentSizesFilename_.c_str(), &st) == 0) // file existed
openExisting();
else
createNew();
}
The reason I am initializing in init rather than in the constructor is that LinearScheme extends PartitionScheme (a superclass with virtual methods), and another subclass does the same because its constructor is used recursively.
I have a QuadTree class which does the same initialization because each QuadTree constructor is applied recursively. The line *this = QuadTree(bounds, maxSize) in the init function of the QuadTree class works just fine.
However, this line in the other subclass (LinearScheme) *this = LinearScheme() causes a segfault.
Any ideas why this might happen?
EDIT
Also replacing the line:
*this = LinearScheme()
with this:
*this;
or removing it altogether gets rid of the segfault ... why?
Sounds like incorrect factory method / builder / deferred construction usage. For many of these object creation patterns, the function that constructs your objects should be a static method, because there doesn't yet exist an instance to manipulate. In others you potentially manipulate an already constructed instance. In either case, if you are actually constructing an object of the class type within the function, you should be using new and eventually returning it.
If you are instead going for a helper method to assist with initialization then you simply shouldn't be constructing the object within the method itself, and you should just be initializing parts of it within your helper.
A factory pattern example:
LinearScheme* LinearScheme::create(...all_your_args....) {
/* construct the thing we are building;
* pass its constructor any arguments that it can handle directly, if you'd like
*/
LinearScheme *out = new LinearScheme(...);
/* do whatever else you have to do */
....
return out;
}
or this helper of sorts that you seem to want
/* this time let's just do 'init' on your object */
void LinearScheme::init(....args....) {
/* possibly check if init has been done already
* (initialized_ is a bool member; a data member cannot share the name of the init() function)
*/
if ( this->initialized_ ) return;
/* proceed to do your initialization stuff
* but don't construct the 'this' instance since it should already exist
*/
this->initialized_ = true; //so we don't init again if you don't need multiple init's
}
Alternatively, you can consider the delegating constructors in C++11 that alex mentions.
However, neither of these really strikes me as being the actual problem here.
It's not working because you probably don't even have a valid *this to dereference. This could be because of your usage, or because the object was never created successfully, potentially due to infinite recursion.
Here's a wikipedia link on the pattern: http://en.wikipedia.org/wiki/Factory_method_pattern
Given what you have said about having to keep passing a dozen arguments around both to parent classes and for your recursive construction, one suggestion you could consider is making a small config struct that you pass along by reference instead of all the discrete parameters. That way you don't have to keep adjusting every signature along the way each time you add / remove another parameter.
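A minimal sketch of the config-struct idea follows. The names SchemeConfig, ExampleScheme and makeExampleScheme are hypothetical and only mirror a few of the init parameters from the question; the real classes would carry more fields.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical bundle of the parameters that init() currently takes one by one.
struct SchemeConfig {
    int tableId = 0;
    std::string basePath;
    std::vector<std::size_t> colElemSizes;
    std::size_t defaultMaxFragmentSize = 0;
    int numCols = 0;
};

// Hypothetical stand-in for a scheme class; it copies what it needs from the config.
class ExampleScheme {
public:
    explicit ExampleScheme(const SchemeConfig &cfg)
        : tableId_(cfg.tableId),
          numCols_(cfg.numCols),
          fragmentSizesFilename_(cfg.basePath + std::to_string(cfg.tableId) + "_cs") {}

    int tableId() const { return tableId_; }
    int numCols() const { return numCols_; }
    const std::string &fragmentSizesFilename() const { return fragmentSizesFilename_; }

private:
    int tableId_;
    int numCols_;
    std::string fragmentSizesFilename_;
};

// Usage sketch: adding a field to SchemeConfig later does not change any signature.
inline ExampleScheme makeExampleScheme() {
    SchemeConfig cfg;
    cfg.tableId = 7;
    cfg.basePath = "/tmp/tables/";
    cfg.numCols = 3;
    return ExampleScheme(cfg);
}
```

Parent classes and recursive constructions can take the same const SchemeConfig& and read only the fields they care about.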
The other idea is to separate entirely the construction of one of your objects from the responsibility of knowing how, where, and when they should be constructed and inserted into your hierarchy. Hard to say without understanding how you will actually be using LinearScheme and what the interface is.
"...in the other subclass (LinearScheme) *this = LinearScheme()"
"The LinearScheme constructor is empty: LinearScheme::LinearScheme()"
If *this is a LinearScheme, LinearScheme's constructor should already have been called, so this line is useless. Besides, it calls the assignment operator: is that properly defined?
It is better to rely on the built-in mechanism for constructing objects. If you want to avoid code repetition, use the C++11 delegating constructors feature. It was specially designed to eliminate "init" methods.
Although, "If there is an infinitely recursive cycle (e.g., constructor C1 delegates to another constructor C2, and C2 also delegates to C1), the behavior is undefined."
So it is up to you to avoid infinite recursion. In your QuadTree you can consider creating nullptr pointers to QuadTreeNode in constructor.
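A small sketch of delegating constructors, assuming a hypothetical class DelegatingScheme with made-up members and default values (none of these names come from the question). All initialization lives in one target constructor, and the other constructors forward to it, so no init method is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical class; names and defaults are illustrative only.
class DelegatingScheme {
public:
    // Target constructor: the single place where members are initialized.
    DelegatingScheme(int tableId, std::string basePath, std::size_t maxFragmentSize)
        : tableId_(tableId),
          basePath_(std::move(basePath)),
          maxFragmentSize_(maxFragmentSize) {}

    // Delegating constructors: forward to the target with default values.
    // They delegate in one direction only, so there is no recursive cycle.
    DelegatingScheme() : DelegatingScheme(0, "", 1024) {}
    explicit DelegatingScheme(int tableId) : DelegatingScheme(tableId, "", 1024) {}

    int tableId() const { return tableId_; }
    const std::string &basePath() const { return basePath_; }
    std::size_t maxFragmentSize() const { return maxFragmentSize_; }

private:
    int tableId_;
    std::string basePath_;
    std::size_t maxFragmentSize_;
};
```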
I have a class which is loaded from an external file, so ideally I would want its constructor to load from a given path. If the load fails, for example because the file is not found or not readable, I want to throw an error (throwing errors from constructors is not a horrible idea, see ISO's FAQ).
There is a problem with this though, I want to handle errors myself in some controlled manner, and I want to do that immediately, so I need to put a try-catch statement around the constructor for this object ... and if I do that, the object is not declared outside the try statement, i.e.:
//in my_class.hpp
class my_class
{
...
public:
my_class(string path);//Throws file not found, or other error
...
};
//anywhere my_class is needed
try
{
my_class my_object(string);
}
catch(/*Whatever error I am interested in*/)
{
//error handling
}
//Problem... now my_object doesn't exist anymore
I have tried a number of ways of getting around it, but I don't really like any of them:
Firstly, I could use a pointer to my_class instead of the class itself:
my_class* my_pointer;
try
{
my_pointer = new my_class(string);
}
catch(/*Whatever error I am interested in*/)
{
//error handling
}
The problem is that the instance of this object doesn't always end up in the same object which created it, so deleting all pointers correctly would be easy to do wrong, and besides, I personally think it is ugly to have some objects be pointers to objects, and have most others be "regular objects".
Secondly, I could use a vector with only one element in much the same way:
std::vector<my_class> single_vector;
try
{
single_vector.push_back(my_class(string));
single_vector.shrink_to_fit();
}
catch(/*Whatever error I am interested in*/)
{
//error handling
}
I don't like the idea of having a lot of single-element vectors though.
Thirdly, I can create an empty faux constructor and use another loading function, i.e.
//in my_class.hpp
class my_class
{
...
public:
my_class() {}// Faux constructor which does nothing
void load(string path);//All the code in the constructor has been moved here
...
};
//anywhere my_class is needed
my_class my_object;
try
{
my_object.load(path);
}
catch(/*Whatever error I am interested in*/)
{
//error handling
}
This works, but largely defeats the purpose of having a constructor, so I don't really like this either.
So my question is, which of these methods for constructing an object, which may throw errors in the constructor, is the best (or least bad)? and are there better ways of doing this?
Edit: Why don't you just use the object within the try-statement?
Because the object may need to be created as the program is first started, and stopped much later. In the most extreme case (which I do actually need in this case also) that would essentially be:
int main()
{
try
{
//... things which might fail
//A few hundred lines of code
}
catch(/*whatever*/)
{
}
}
I think this makes my code hard to read since the catch statement will be very far from where things actually went wrong.
One possibility is to wrap the construction and error handling in a function, returning the constructed object. Example :
#include <string>
class my_class {
public:
my_class(std::string path);
};
my_class make_my_object(std::string path)
{
try {
return {std::move(path)};
}
catch(...) {
// Handle however you want
}
}
int main()
{
auto my_object = make_my_object("this path doesn't exist");
}
But beware that the example is incomplete because it isn't clear what you intend to do when construction fails. The catch block has to either return something, throw or terminate.
If you could return a different instance, one with a "bad" or "default" state, you could have just initialized your instance to that state in my_class(std::string path) when it was determined the path is invalid. So in that case, the try/catch block is not needed.
If you rethrow the exception, then there is no point in catching it in the first place. In that case, the try/catch block is also not needed, unless you want to do a bit of extra work, like logging.
If you want to terminate, you can just let the exception go uncaught. Again, in that case, the try/catch block is not needed.
The real solution here is probably to not use a try/catch block at all, unless there is actually error handling you can do that shouldn't be implemented as part of my_class which isn't made apparent in the question (maybe a fallback path?).
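If a fallback path is the kind of error handling in question, a factory function could look roughly like this. This is only a sketch: the my_class below is a hypothetical stand-in in which an empty path plays the role of "file not found", and the two-argument make_my_object is an assumption, not something from the question.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for the question's class: an empty path plays the
// role of "file not found", so the sketch needs no real files.
class my_class {
public:
    explicit my_class(std::string path) : path_(std::move(path)) {
        if (path_.empty()) throw std::runtime_error("cannot load: empty path");
    }
    const std::string &path() const { return path_; }

private:
    std::string path_;
};

// Try the primary path first; on failure fall back to a second path.
// If the fallback fails too, its exception propagates to the caller,
// so every control path either returns an object or throws.
my_class make_my_object(const std::string &primary, const std::string &fallback) {
    try {
        return my_class(primary);
    } catch (const std::runtime_error &) {
        return my_class(fallback);
    }
}
```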
"and if I do that, the object is not declared outside the try statement"
"I have tried a number of ways of getting around it"
That doesn't need to be a problem. There's not necessarily a need to get around it. Simply use the object within the try statement.
If you really cannot have the try block around the entire lifetime, then this is a use case for std::optional:
std::optional<my_class> maybe_my_object;
try {
maybe_my_object.emplace(string);
} catch(...) {}
"The problem is that the instance of this object doesn't always end up in the same object which created it, so deleting all pointers correctly would be easy to do wrong,"
A pointer returned by new is correct to delete. In the error case, simply set the pointer to null and there would be no problem. That said, use a smart pointer instead for dynamic allocation, if you were to use this approach.
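A sketch of the smart-pointer variant, again with a hypothetical my_class whose constructor throws on an empty path, and a hypothetical helper try_load. The std::unique_ptr deletes the object automatically, wherever it ends up.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in: an empty path plays the role of a load failure.
class my_class {
public:
    explicit my_class(std::string path) : path_(std::move(path)) {
        if (path_.empty()) throw std::runtime_error("cannot load: empty path");
    }
    const std::string &path() const { return path_; }

private:
    std::string path_;
};

// Returns an owning pointer, or a null pointer if construction failed.
std::unique_ptr<my_class> try_load(const std::string &path) {
    try {
        return std::unique_ptr<my_class>(new my_class(path));
    } catch (const std::runtime_error &) {
        // Error handling goes here; the caller sees a null pointer.
        return nullptr;
    }
}
```

Ownership can be passed on with std::move, so there is no manual delete to get wrong.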
single_vector.push_back(my_class(string));
single_vector.shrink_to_fit();
Don't push and shrink when you know the number of objects that are going to be in the vector. Use reserve instead if you were to use this approach.
The object creation can fail because a resource is unavailable. It's not the creation which fails; it is a prerequisite which is not fulfilled.
Consequently, separate these two concerns: First obtain all resources and then, if that succeeded, create the object with these resources and use it. The object creation as such in this design cannot fail, the constructor is nothrow; it is trivial boilerplate code (copy data etc.). If, on the other hand, resource acquisition failed, object creation and object use are both skipped: Your problem with existing but unusable objects is gone.
Responding to your edit about try/catch comprising the entire program: Exceptions as error indicators are better suited for things which are done in many places at various times in a program because they guarantee error handling (by default through an abort) while separating it from the normal control flow. This is impossible to do with classic return value examination, which leaves us with a choice between unreadable or unreliable programs.
But if you have long-lived objects which are created only rarely (in your example: only at startup) you don't need exceptions. As you said, constructor exceptions guarantee that only properly initialized objects can be used. But if such an object is only created at startup this danger is low. You check for success one way or another and exit the program which cannot perform its purpose if the initial resource acquisition failed. This way the error is handled where it occurred. Even in less extreme cases (e.g. when an object is created at the beginning of a large function other than main) this may be the simpler solution.
In code, my suggestion looks like this:
struct T1;
struct T2;
struct myEx { myEx(const char *); };
void exit(int);
T1 *acquireResource1(); // e.g. read file
T2 *acquireResource2(); // e.g. connect to db
void log(const char *what);
class ObjT
{
public:
struct RsrcT
{
T1 *mT1;
T2 *mT2;
operator bool() { return mT1 && mT2; }
};
ObjT(const RsrcT& res) noexcept
{
// initialize from file data etc.
}
// more member functions using data from file and db
};
int main()
{
ObjT::RsrcT rsrc = { acquireResource1(), acquireResource2() };
if(!rsrc)
{
log("bummer");
exit(1);
}
///////////////////////////////////////////////////
// all resources are available. "Real" code starts here.
///////////////////////////////////////////////////
ObjT obj(rsrc);
// 1000 lines of code using obj
}
Once again I got caught expecting a function to return a proper value and was then disappointed, getting odd behavior and misleading debug information instead.
It's fairly well known that you cannot return a local variable from a function and expect it to arrive as you would expect. Testing
int i=2;
int k=4;
return make_pair<int,int>(i*i,k*k);
Does indeed return something respectable. But using more elaborate objects than simple types seems to catch me every time.
So, is there any formality that I can use for discriminating on what can and what cannot be returned safely from a function?
----------- added on edit: ------------
Here is the example that does not work, taken brutally out of context.
Problem-context is a (to be GUI) tree of rectangles for the screen.
Class node inherits from a base (rectangle) containing 3 pointers to plain types (again, used to make values stick) .. the base uses new in constructor
pair<node,node> node_handler::split( vector<node>::iterator& this_node, double ratio, bool as_horizontal ){
//this_node becomes parents to the split-twins
this_node->my_ratio=ratio;
double firstW, firstH;
double secW, secH;
glm::dvec2 afirst, asecond;
if(as_horizontal ){
firstW = *this_node->plWidth*LETTER_PIXEL_WIDTH;
firstH = *this_node->plHeight*LINE_PIXEL_HEIGHT*ratio;
afirst = *this_node->pPoint;
secW = firstW;
secH = LINE_PIXEL_HEIGHT*(*this_node->plHeight)*(1.0d-ratio);
asecond= afirst+glm::dvec2(0.0d, firstH);
}
else{
firstW = ratio*(*this_node->plWidth)*LETTER_PIXEL_WIDTH;
firstH = *this_node->plHeight*LINE_PIXEL_HEIGHT;
afirst = *this_node->pPoint;
secW = (1.0d-ratio)*(*this_node->plWidth)*LETTER_PIXEL_WIDTH;
secH = firstH;
asecond= afirst+glm::dvec2(firstW,0.0d);
}
return make_pair<node,node>( node(afirst ,firstW, firstH) , node(asecond ,secW, secH) ) ;
}
Technically, you can return anything from a function.
Now when you return a pointer or a reference to something that is only local, then you have a problem.
Solutions:
Return copies (OK with copy elision anyway)
Return shared_ptr<>/unique_ptr<> for something that must not be copied.
Return only basic types and pass to the function a reference to an object that might be modified.
Do not create something in the function that needs to be manually destroyed later (say, a pointer created with new).
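The options above can be sketched as follows. All function names are hypothetical; the unsafe variant is left commented out so that the snippet stays well-defined.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Safe: returns the objects by value; nothing refers to the locals afterwards.
std::pair<std::vector<int>, std::string> make_values() {
    std::vector<int> squares = {1, 4, 9};
    std::string label = "squares";
    return std::make_pair(squares, label);
}

// Safe: ownership of the heap object is handed to the caller.
std::unique_ptr<std::vector<int>> make_owned() {
    return std::unique_ptr<std::vector<int>>(new std::vector<int>{1, 2, 3});
}

// Safe: the caller owns the object; the function only modifies it.
void fill_with_sevens(std::vector<int> &out) {
    out.assign(3, 7);
}

// Unsafe (commented out): returns a reference to a local that is destroyed
// when the function returns.
// const std::vector<int> &broken() { std::vector<int> v = {1}; return v; }
```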
It's dawning on me that classes containing pointer members reasonably have to have custom copy/assignment operators. I never got to grips with the "rho" variable referred to in the books I read at the time ... "right_hand_object" it must be! That's my epiphany. It was following the business of the constructors and your talk of copyable objects that squeezed this old rho-problem of mine.
I'm sorry for having spread my frustration on you.
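For reference, a sketch of what custom copy operations for a class with pointer members can look like. Rect and split are hypothetical and only loosely modeled on the rectangle base described in the question; the point is that copying allocates fresh storage instead of sharing the original pointer.

```cpp
#include <cassert>
#include <utility>

// Hypothetical rectangle that owns heap-allocated dimensions.
class Rect {
public:
    Rect(double w, double h) : dims_(new double[2]) {
        dims_[0] = w;
        dims_[1] = h;
    }

    // Copy constructor: deep-copy the pointed-to values.
    Rect(const Rect &rhs) : dims_(new double[2]) {
        dims_[0] = rhs.dims_[0];
        dims_[1] = rhs.dims_[1];
    }

    // Copy assignment via copy-and-swap: rhs is already a copy (taken by
    // value), so swapping pointers leaves the old storage for rhs to free.
    Rect &operator=(Rect rhs) {
        std::swap(dims_, rhs.dims_);
        return *this;
    }

    ~Rect() { delete[] dims_; }

    double width() const { return dims_[0]; }
    double height() const { return dims_[1]; }
    void setWidth(double w) { dims_[0] = w; }

private:
    double *dims_;
};

// Returning a pair of such objects by value is safe once copying is correct.
std::pair<Rect, Rect> split(const Rect &r, double ratio) {
    return std::make_pair(Rect(r.width() * ratio, r.height()),
                          Rect(r.width() * (1.0 - ratio), r.height()));
}
```

Without the user-defined copy constructor and assignment operator, both copies would hold the same pointer and the destructor would free it twice.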
I have a Function pass, called firstPass, which does some analysis and populates:
A a;
where
typedef std::map< std::string, B* > A;
class firstPass : public FunctionPass {
A a;
};
typedef std::vector< C* > D;
class B {
D d;
};
class C {
// some class packing information about basic blocks;
};
Hence I have a map of vectors, keyed by std::string.
I wrote associated destructors for these classes. This pass works successfully on its own.
I have another Function pass, called secondPass, needing this structure of type A to make some transformations. I used
bool secondPass::doInitialization(Module &M) {
errs() << "now running secondPass\n";
a = getAnalysis<firstPass>().getA();
return false;
}
void secondPass::getAnalysisUsage(AnalysisUsage &AU) const {
AU.addRequired<firstPass>();
AU.setPreservesAll();
}
The whole code compiles fine, but I get a segmentation fault when printing this structure at the end of my first pass only if I call my second pass (since B* is null).
To be clear:
opt -load ./libCustomLLVMPasses.so -passA < someCode.bc
prints in doFinalization() and exits successfully
opt -load ./libCustomLLVMPasses.so -passA -passB < someCode.bc
gives a segmentation fault.
How should I wrap this data structure and pass it to the second pass without issues? I tried std::unique_ptr instead of raw ones but I couldn't make it work. I'm not sure if this is the correct approach anyway, so any help will be appreciated.
EDIT:
I solved the problem of seg. fault. It was basically me calling getAnalysis in doInitialization(). I wrote a ModulePass to combine my firstPass and secondPass whose runOnModule is shown below.
bool MPass::runOnModule(Module &M) {
for(Function& F : M) {
errs() << "F: " << F.getName() << "\n";
if(!F.getName().equals("main") && !F.isDeclaration())
getAnalysis<firstPass>(F);
}
StringRef main = StringRef("main");
A& a = getAnalysis<firstPass>(*(M.getFunction(main))).getA();
return false;
}
This also let me control the order in which the functions are processed.
Now I can get the output of a pass but cannot use it as an input to another pass. I think this shows that the passes in llvm are self-contained.
I'm not going to comment on the quality of the data structures based on their C++ merit (it's hard to comment on that just by this minimal example).
Moreover, I wouldn't use the doInitialization method if the actual initialization is that simple, but this is a side comment too. (The doc does not mention anything explicitly about it, but if it is run once per Module while the runOn method is run on every Function of that module, it might be an issue.)
I suspect that the main issue stems from the fact that A a in your firstPass is bound to the lifetime of the pass object, which is over once the pass is done. The simplest change would be to allocate that object on the heap (e.g. with new) and return a pointer to it when calling getAnalysis<firstPass>().getA();.
Please note that using this approach might require manual cleanup if you decide to use a raw pointer.
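To avoid the manual cleanup, the ownership could be expressed with smart pointers. The sketch below is plain C++ and deliberately does not use any LLVM API: FirstAnalysis, record and produce are hypothetical stand-ins, and how the LLVM pass manager handles pass lifetimes is not modeled here.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the question's C, B and A types.
struct C {
    explicit C(int id) : blockId(id) {}
    int blockId;  // placeholder for per-basic-block information
};

struct B {
    std::vector<std::unique_ptr<C>> d;  // B owns its C objects
};

using A = std::map<std::string, std::unique_ptr<B>>;  // A owns its B objects

// Hypothetical producer that keeps the result on the heap and hands out
// shared ownership, so the data can outlive the object that built it.
class FirstAnalysis {
public:
    FirstAnalysis() : a_(std::make_shared<A>()) {}

    void record(const std::string &function, int blockId) {
        std::unique_ptr<B> &entry = (*a_)[function];
        if (!entry) entry.reset(new B());
        entry->d.push_back(std::unique_ptr<C>(new C(blockId)));
    }

    std::shared_ptr<const A> getA() const { return a_; }

private:
    std::shared_ptr<A> a_;
};

// The producer is destroyed at the end of this function; the returned
// shared_ptr keeps the map and everything it owns alive.
std::shared_ptr<const A> produce() {
    FirstAnalysis first;
    first.record("main", 0);
    first.record("main", 1);
    return first.getA();
}
```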
How can I calculate a hash/checksum/fingerprint of an object in c++?
Requirements:
The function must be 'injective'(*). In other words, there should be no two different input objects, that return the same hash/checksum/fingerprint.
Background:
I am trying to come up with a simple pattern for checking whether or not an entity object has been changed since it was constructed. (In order to know which objects need to be updated in the database).
Note that I specifically do not want to mark the object as changed in my setters or anywhere else.
I am considering the following pattern: In short, every entity object that should be persisted, has a member function "bool is_changed()". Changed, in this context, means changed since the objects' constructor was called.
Note: My motivation for all this is to avoid the boilerplate code that comes with marking objects as clean/dirty or doing a member by member comparison. In other words, reduce risk of human error.
(Warning: pseudo C++ code ahead. I have not tried compiling it).
class Foo {
private:
std::string my_string;
// Assume the "fingerprint" is of type long.
long original_fingerprint;
long current_fingerprint() const
{
// *** Suggestions on which algorithm to use here? ***
}
public:
Foo(const std::string& my_string) :
my_string(my_string)
{
original_fingerprint = current_fingerprint();
}
bool is_changed() const
{
// If new calculation of fingerprint is different from the one
// calculated in the constructor, then the object has
// been changed in some way.
return current_fingerprint() != original_fingerprint;
}
void set_my_string(const std::string& new_string)
{
my_string = new_string;
}
};
void client_code()
{
auto foo = Foo("Initial string");
// should now return **false** because
// the object has not yet been changed:
foo.is_changed();
foo.set_my_string("Changed string");
// should now return **true** because
// the object has been changed:
foo.is_changed();
}
(*) In practice, not necessarily in theory (like uuids are not unique in theory).
You can use the CRC32 algorithm from Boost. Feed it with the memory locations of the data you want to checksum. You could use a cryptographic hash for this, but those are designed to guard against intentional data corruption and are slower. A CRC performs better.
For this example, I've added another data member to Foo:
int my_integer;
And this is how you would checksum both my_string and my_integer:
#include <boost/crc.hpp>
// ...
long current_fingerprint() const
{
boost::crc_32_type crc32;
crc32.process_bytes(my_string.data(), my_string.length());
crc32.process_bytes(&my_integer, sizeof(my_integer));
return crc32.checksum();
}
However, now we're left with the issue of two objects having the same fingerprint if my_string and my_integer are equal. To fix this, we should include the address of the object in the CRC, since C++ guarantees that different objects will have different addresses.
One would think we can use:
process_bytes(&this, sizeof(this));
to do it, but we can't since this is an rvalue and thus we can't take its address. So we need to store the address in a variable instead:
long current_fingerprint() const
{
boost::crc_32_type crc32;
void* this_ptr = this;
crc32.process_bytes(&this_ptr, sizeof(this_ptr));
crc32.process_bytes(my_string.data(), my_string.length());
crc32.process_bytes(&my_integer, sizeof(my_integer));
return crc32.checksum();
}
Such a function does not exist, at least not in the context that you are requesting.
The STL provides hash functions for basic types (std::hash), and you could use these to implement a hash function for your objects using any reasonable hashing algorithm.
However, you seem to be looking for an injective function, which causes a problem. Essentially, to have an injective function, it would be necessary to have an output of size greater or equal to that of the object you are considering, since otherwise (from the pigeon hole principle) there would be two inputs that give the same output. Given that, the most sensible option would be to just do a straight-up comparison of the object to some sort of reference object.
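A sketch of the straight-up comparison against a reference object: the constructor keeps a snapshot of the persisted state, and is_changed() compares the current state with it. The nested State struct and the my_integer member are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <utility>

class Foo {
    // All persisted members live in one struct so they can be compared at once.
    struct State {
        std::string my_string;
        int my_integer;
        bool operator==(const State &rhs) const {
            return std::tie(my_string, my_integer) ==
                   std::tie(rhs.my_string, rhs.my_integer);
        }
    };

public:
    Foo(std::string s, int n)
        : state_(State{std::move(s), n}), original_(state_) {}

    void set_my_string(const std::string &s) { state_.my_string = s; }
    void set_my_integer(int n) { state_.my_integer = n; }

    // Exact comparison with the snapshot taken in the constructor:
    // unlike a fixed-size fingerprint, this cannot collide.
    bool is_changed() const { return !(state_ == original_); }

private:
    State state_;
    State original_;
};
```

The cost is one extra copy of the state per object. Note that setting a member back to its original value makes is_changed() return false again, which may or may not be what you want.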
I need to check if my object Course is in a safe empty state.
Here is my failed attempt:
const bool Course::isEmpty() const {
if (Course() == nullptr) {
return true;
}
else {
return false;
}
}
Constructors:
Course::Course() {
courseTitle_ = new char[21]; // name
courseTitle_[0] = '\0'; // empty string; assigning '\0' to the pointer itself would discard the buffer
credits_ = 0;//qtyNeeded
studyLoad_ = 0;//quantity
strcpy(courseCode_, "");//sku
}
Course::Course(const char* courseCode, const char* courseTitle, int credits , int studyLoad ) {
strcpy(courseCode_, courseCode);
courseTitle_ = new char[21];
strcpy(courseTitle_, courseTitle);
studyLoad_ = studyLoad;
credits_ = credits;
}
Apparently, doing Course() == nullptr is not truly checking if the object is in a safe empty state, and checking whether individual variables are set to 0 will not work in my program. I need to check if the entire object was set to a safe empty state.
Edit: Some of you are asking what my isEmpty() function is supposed to be used for. There is a tester that is supposed to test if my isEmpty() works well.
bool isEmptyTest0() {
// empty test
sict::Course c0;
return c0.isEmpty();
}
bool isEmptyTest1() {
// empty test
sict::Course c0("", "title", 3, 3);
return c0.isEmpty();
}
bool isEmptyTest2() {
// empty test
sict::Course c0("code", "", 3, 3);
return c0.isEmpty();
}
bool isEmptyTest3() {
// empty test
sict::Course c0("code", "title", -1, 3);
return c0.isEmpty();
}
bool isEmptyTest4() {
// empty test
sict::Course c0("code", "title", 3, -1);
return c0.isEmpty();
}
bool regularInitTest() {
// regular
sict::Course c5("OOP244", "Object-Oriented Programming in C++", 1, 4);
return (!c5.isEmpty()
&& !strcmp("OOP244", c5.getCourseCode())
&& !strcmp("Object-Oriented Programming in C++", c5.getCourseTitle())
&& (c5.getCredits() == 1)
&& c5.getStudyLoad() == 4
);
}
Note that in regularInitTest() my assignment operators work fine, but the test never passes because !c5.isEmpty() fails. Hopefully I explained it correctly.
Most probably here is what you should do to make the tests pass.
In the 2nd (4-argument) constructor, do some checking of the input, e.g. check if credits is positive. Do check all arguments for all possible errors you can imagine, including those in isEmptyTest0..4. If there is an error, initialize the object the same way as the 1st (0-argument) constructor does. If there is no error, initialize the data members from the arguments.
Here is how to implement the isEmpty method: it should return true iff all the data members of the object have the empty/zero/default value, as initialized by the 1st (0-argument) constructor.
The notion of a safe empty state in itself still doesn't make sense, but the concept the professor is trying to teach does. I'll try to summarize my understanding here. Constructors can receive invalid arguments, from which it's not possible to initialize a meaningful and valid object. The programmer should add code for error checking and handling everywhere in the program, including constructors. There are multiple approaches to input validation and error handling in constructors, e.g. 1. throwing an exception; 2. aborting the entire program with an error message; 3. initializing the object to a special, invalid state; 4. initializing the object to a special, empty state. (There is also a fifth option, which is strongly discouraged: 5. keeping some data members of the object uninitialized.) Each of these approaches has pros and cons. In this assignment, the professor wants you to implement #4. See the 2nd paragraph of my answer for how to do it.
When the professor asks for a safe empty state, he most probably means that you should be doing input validation in the constructor, and in case of an error doing #4 rather than #5.
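A sketch of approach #4, under some assumptions: it uses std::string members instead of the raw char buffers in the question, and it treats empty strings and non-positive numbers as invalid, which matches the tests shown but is otherwise a guess at the assignment's rules.

```cpp
#include <cassert>
#include <string>

// Simplified, hypothetical version of Course.
class Course {
public:
    // Default constructor: the safe empty state.
    Course() : credits_(0), studyLoad_(0) {}

    // Validating constructor: on any invalid argument the object is left
    // in exactly the same state that the default constructor produces.
    Course(const char *code, const char *title, int credits, int studyLoad)
        : credits_(0), studyLoad_(0) {
        const bool valid = code && code[0] != '\0' &&
                           title && title[0] != '\0' &&
                           credits > 0 && studyLoad > 0;
        if (valid) {
            courseCode_ = code;
            courseTitle_ = title;
            credits_ = credits;
            studyLoad_ = studyLoad;
        }
    }

    // True iff every member still has its default (empty) value.
    bool isEmpty() const {
        return courseCode_.empty() && courseTitle_.empty() &&
               credits_ == 0 && studyLoad_ == 0;
    }

    const std::string &getCourseCode() const { return courseCode_; }
    const std::string &getCourseTitle() const { return courseTitle_; }
    int getCredits() const { return credits_; }
    int getStudyLoad() const { return studyLoad_; }

private:
    std::string courseCode_;
    std::string courseTitle_;
    int credits_;
    int studyLoad_;
};
```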
I agree with pts that safe empty state is ill-defined.
The missing principle, it seems to me after reading the comments, is Resource Acquisition Is Initialization (RAII). A constructor is a transaction, in a way: you get either
a valid object, or
an exception.
Valid here is defined by the class. Usually it means that the passed parameters were incorporated into the object, and all required resources were successfully allocated and/or found.
Aborting the program is rarely an option, and returning an error (from a constructor) never is. Constructing an invalid object is usually done only in environments where exceptions are prohibited.
There is a special case: the default constructor. Sometimes it's desirable to "make an empty" thing that will be fully initialized later.
Consider std::string. It can be constructed with a value, and throws an exception if memory cannot be allocated. Or it can be constructed without a value, and later assigned one. Your class could be similar, in which case safe empty just means a state that the user would be happy to destroy when calling the "init" function. You don't have to test every member variable; you just have to check something that will be true only for a completely initialized object.
Then there's the question of "is valid". An "empty" object can be "initialized", but it can't be used. It's not "valid" for use until fully initialized, whether at construction, or via the 2-step with a default constructor and a subsequent "init".
There is a widely accepted idiom for testing whether an object is valid or not: a user-defined conversion to void *:
...
public:
operator void*() { return is_valid()? this : nullptr; }
...
where is_valid() may be a private function. With that in place, the user can test his instantiated object thus:
class A;
A foo; // note: "A foo();" would declare a function, not an object
...
if (!foo) { foo.open(...); }
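Since C++11 the same test can be written with an explicit conversion to bool, which avoids accidental conversions to a pointer. A minimal sketch, assuming a hypothetical Connection class with two-step initialization and a hypothetical helper open_if_needed:

```cpp
#include <cassert>
#include <string>

// Hypothetical class that starts empty and becomes valid after open().
class Connection {
public:
    Connection() = default;

    void open(const std::string &target) { target_ = target; }

    // explicit: usable in if/while/! conditions, but the object does not
    // silently convert to an integer or a pointer elsewhere.
    explicit operator bool() const { return is_valid(); }

private:
    bool is_valid() const { return !target_.empty(); }
    std::string target_;
};

// Opens the connection only if it is still in its empty state.
// Returns true if it had to be opened.
bool open_if_needed(Connection &c) {
    if (!c) {
        c.open("default");
        return true;
    }
    return false;
}
```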
I know I haven't answered your question, exactly. I hope I've provided some background that makes it easier for you to answer it yourself.