Referencing AST nodes after construction with Visitor pattern - c++

I have a memory management question. For one of my projects I am building an interpreter for a small programming language. One of the first steps is to model and build an Abstract Syntax Tree.
As of now, I'm using smart pointers to manage the lifetime of nodes. I figured that every parent node is the owner of its children, but each node must also be shared with the environment (for example, to know which part of the tree a method's body belongs to) and with the garbage collector, which must keep a list of all references to run the naïve mark-and-sweep algorithm. Therefore, I am using std::shared_ptr to keep track of the references. For instance, here's an example of a Block node, which basically represents a lambda expression:
#ifndef NAYLANG_BLOCK_H
#define NAYLANG_BLOCK_H
#include <model/ast/expressions/Expression.h>
#include <model/ast/declarations/Declaration.h>
#include <memory>
#include <vector>
namespace naylang {
#define BlockPtr std::shared_ptr<Block>
class Block : public Expression {
std::vector<std::shared_ptr<Statement>> _body;
std::vector<std::shared_ptr<Declaration>> _params;
public:
Block() = default;
void accept(Evaluator &evaluator) override;
const std::vector<std::shared_ptr<Statement>> &body() const;
const std::vector<std::shared_ptr<Declaration>> &params() const;
void addStatement(std::shared_ptr<Statement> statement);
void addParameter(std::shared_ptr<Declaration> param);
};
} // end namespace naylang
#endif //NAYLANG_BLOCK_H
As you can see, this node is the owner of all its parameters and body expressions, and has accessors so that the Evaluator can traverse the tree.
Now, the problem comes when trying to have nodes that are bound at evaluation time to other nodes, for example:
#ifndef NAYLANG_REQUEST_H
#define NAYLANG_REQUEST_H
#include <model/ast/expressions/Expression.h>
#include <string>
#include <memory>
#include <vector>
#include <model/ast/declarations/MethodDeclaration.h>
namespace naylang {
class Request : public Expression {
std::string _name;
std::vector<ExpressionPtr> _params;
// We use naked pointers because we don't want to worry
// about memory management, and there is no ownership
// with the declaration.
const MethodDeclaration *_binding;
public:
Request(const std::string &methodName);
Request(const std::string &methodName, const std::vector<ExpressionPtr> params);
void accept(Evaluator &evaluator) override;
void bindTo(const MethodDeclaration *_binding);
const std::string &method() const;
const std::vector<ExpressionPtr> &params() const;
const MethodDeclaration &declaration() const;
};
} // end namespace naylang
#endif //NAYLANG_REQUEST_H
As you can see, bindTo() is called when a BindingEvaluator (subclass of Evaluator) evaluates a Request object, long after it's constructed. However, I am really not sure about what the _binding parameter should look like. Here's a part of the Evaluator interface:
#ifndef NAYLANG_EVALUATOR_H
#define NAYLANG_EVALUATOR_H
#include <model/ast/Statement.h>
namespace naylang {
class Request;
class Block;
class Evaluator {
public:
Evaluator() = default;
virtual ~Evaluator() = default;
// Methods left blank to be overridden by the subclasses.
// For example, a Binding Evaluator might be only interested in
// evaluating VariableReference and Request Statements
virtual void evaluate(Request &expression) {}
virtual void evaluate(Block &expression) {}
};
}
#endif //NAYLANG_EVALUATOR_H
Here's my rationale:
The reference should be polymorphic, and therefore should be some kind of pointer.
The reference does not denote ownership, and therefore it should not be a std::shared_ptr.
In addition, we need the Visitor pattern, so every node has the function void accept(Evaluator &evaluator);. As a node cannot return a shared_ptr of itself, we cannot change the interface to something like virtual void evaluate(std::shared_ptr<Request> &expression) {}.
Thus, naked pointers. I really want to get this right before moving on, because it's a ton of code to change every time I rethink it (ASTs are verbose...)
Thank you in advance.
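On the third point of the rationale, the standard library does offer a way for an object that is already owned by a std::shared_ptr to hand out another shared_ptr to itself: std::enable_shared_from_this. The following is a minimal sketch under stated assumptions, not the project's code: MethodDeclaration is reduced to a stand-in struct, RecordingEvaluator is a hypothetical evaluator, and holding the binding as a std::weak_ptr is one possible non-owning choice among several.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Simplified stand-in for the real declaration node (assumption).
struct MethodDeclaration {
    std::string name;
};

class Request;

class Evaluator {
public:
    virtual ~Evaluator() = default;
    // The visitor receives a shared handle instead of a plain reference.
    virtual void evaluate(const std::shared_ptr<Request> &) {}
};

class Request : public std::enable_shared_from_this<Request> {
    std::string _name;
    // Non-owning: does not keep the declaration alive,
    // but can tell whether it still exists.
    std::weak_ptr<const MethodDeclaration> _binding;

public:
    explicit Request(std::string methodName) : _name(std::move(methodName)) {}

    // Only valid while this Request is owned by a std::shared_ptr.
    void accept(Evaluator &evaluator) { evaluator.evaluate(shared_from_this()); }

    void bindTo(const std::shared_ptr<const MethodDeclaration> &declaration) {
        _binding = declaration;
    }

    // Empty if the request is unbound or the declaration was destroyed.
    std::shared_ptr<const MethodDeclaration> declaration() const {
        return _binding.lock();
    }
};

// Hypothetical evaluator that just remembers the node it visited.
class RecordingEvaluator : public Evaluator {
public:
    std::shared_ptr<Request> seen;
    void evaluate(const std::shared_ptr<Request> &expression) override {
        seen = expression;
    }
};
```

Whether this is preferable to a raw pointer depends on whether a Request can outlive its declaration; if the tree guarantees it cannot, the raw pointer from the question expresses the same non-ownership with less machinery.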

Related

Using `std::call_once` for lazy initialization

Briefly, I have a class that lazily initializes one of its data members and I'd like to figure out the best way to do this in a multithreaded environment.
In more detail, my class currently looks something like this:
#include <algorithm>
#include <optional>
#include <vector>
class A_single_threaded
{
public:
bool query(const int val) const
{
// if opt_vec is not initialized, do that now
if (!(opt_vec.has_value()))
initialize_vec();
// return true if opt_vec contains val
return std::find(std::cbegin(*opt_vec), std::cend(*opt_vec), val) != std::cend(*opt_vec);
}
private:
mutable std::optional<std::vector<int>> opt_vec;
// sets opt_vec
void initialize_vec() const;
};
initialize_vec is the only method that modifies opt_vec, and query is the only method that calls initialize_vec. opt_vec can potentially be empty after initialize_vec returns, so giving the data member std::optional type helps distinguish when it's unset and when it's set and empty. In other instances opt_vec winds up being large and initializing it is time-consuming. And since not every A_single_threaded instance will need to run query anyway, it makes sense to avoid initializing opt_vec until a user call to query makes clear that initialization is necessary.
The approach above seems OK for a single thread, but I think it isn't naturally multithread-able. The calls to opt_vec.has_value() and initialize_vec are necessarily unsynchronized, and the gap between their return times allows for a data race that I don't think can be fixed with a mutex. Instead I think the correct solution involves replacing the std::optional with std::call_once, something like the below:
#include <algorithm>
#include <mutex>
#include <vector>
class A_multi_threaded
{
public:
bool query(const int val) const
{
// if opt_vec is not initialized, do that now
std::call_once(opt_vec_flag, &A_multi_threaded::initialize_vec, this);
// return true if opt_vec contains val
return std::find(std::cbegin(opt_vec), std::cend(opt_vec), val) != std::cend(opt_vec);
}
private:
mutable std::vector<int> opt_vec;
mutable std::once_flag opt_vec_flag;
// sets opt_vec
void initialize_vec() const;
};
I'd appreciate answers to a couple of questions:
Is the implementation I sketched for A_multi_threaded actually thread-safe?
Will A_multi_threaded have the same behavior as A_single_threaded in a single-threaded environment?
Is there a sensible implementation of A_multi_threaded that mimics the implementation of A_single_threaded?
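On the third question, one possible sketch keeps the std::optional and the shape of A_single_threaded::query, and holds a single std::mutex across both the has_value() check and the initialization, so that no other thread can run between them. The class name A_mutex_sketch, the init_calls counter, and the placeholder contents {2, 3, 5} are assumptions made for illustration; the real initialize_vec is not shown in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <mutex>
#include <optional>
#include <vector>

class A_mutex_sketch
{
public:
    bool query(const int val) const
    {
        // The lock covers the check, the initialization, and the search.
        std::lock_guard<std::mutex> lock(opt_vec_mutex);
        if (!opt_vec.has_value())
            initialize_vec();
        return std::find(opt_vec->cbegin(), opt_vec->cend(), val) != opt_vec->cend();
    }

    // Added only so the sketch can show that initialization ran once.
    int init_count() const
    {
        std::lock_guard<std::mutex> lock(opt_vec_mutex);
        return init_calls;
    }

private:
    mutable std::mutex opt_vec_mutex;
    mutable std::optional<std::vector<int>> opt_vec;
    mutable int init_calls = 0;

    // Placeholder body (assumption); must only be called with the mutex held.
    void initialize_vec() const
    {
        opt_vec = std::vector<int>{2, 3, 5};
        ++init_calls;
    }
};
```

The cost of this design is that every query takes the lock, so concurrent readers are serialized even after initialization. std::call_once typically avoids that after the first call, which is one reason to prefer it when queries are frequent.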

Passing std::vector of std::shared_ptr, not updating the objects

Okay, I may be doing this wrong, but I am at my wits end.
I have a vector of shared_ptr to my Node class that I pass around for various things; my Node class has a vector of shared_ptr to its neighbors, which are also of type Node.
I have a class that generates the mesh of nodes for me and returns a std::vector<std::shared_ptr<Node>> nodes, plus a std::shared_ptr<Node> to a significant node.
I then pass this vector into an indexer that creates a second list that is a subset of the first of about 10% the size, which it returns as std::vector<std::shared_ptr<Node>> indexedNodes.
After these are created, I pass them into another object that keeps them for later reference.
Then a modifier class gets a single random node from indexedNodes and uses it to walk through the node's neighbors, modifying a height value.
Later, when I go to export these out, the values show up as 0/initialized.
Some things to note: I pass the data into the functions and return it as plain std::vector<std::shared_ptr<Node>>, which I figured is my issue. I am just not sure how to properly pass a container of my shared_ptrs so that I don't make copies.
If more info is needed, let me know. I am looking for an example or a reference that I can understand.
Sorry for the code, it is not beautiful, and it uses dynamically loaded libraries.
The function where the work is done:
void cruthu::Cruthu::Run() {
std::shared_ptr<cruthu::ITeraGen> teraGen(this->mSettings.TeraGen.Factory->DLGetInstance());
std::vector<std::shared_ptr<cruthu::Node>> nodes(teraGen->Create());
std::shared_ptr<cruthu::Node> significantNode(teraGen->GetSignificantNode());
std::vector<std::shared_ptr<cruthu::IIndexer>> indexers;
for(const auto indexer : this->mSettings.Indexers) {
indexers.push_back(indexer.Factory->DLGetInstance());
}
std::vector<std::shared_ptr<cruthu::Node>> indexedNodes(indexers.at(0)->Index(nodes));
std::shared_ptr<cruthu::ITera> tera(this->mSettings.Tera.Factory->DLGetInstance());
tera->SetNodes(nodes);
tera->SetIndexedNodes(indexedNodes);
tera->SetSignificantNode(significantNode);
for(const auto & formaF : this->mSettings.Formas) {
std::shared_ptr<cruthu::IForma> forma(formaF.Factory->DLGetInstance());
forma->SetNode(tera->GetIndexedNode());
forma->Modify();
std::cout << std::to_string(tera->GetIndexedNode()->GetHeight()) << std::endl;
}
this->CreateImage(tera);
}
TeraGen:
#ifndef CRUTHU_ITERAGEN_HPP
#define CRUTHU_ITERAGEN_HPP
#include <cruthu/Node.hpp>
#include <vector>
namespace cruthu {
class ITeraGen {
public:
virtual ~ITeraGen() = default;
virtual std::vector<std::shared_ptr<cruthu::Node>> Create() = 0;
virtual std::shared_ptr<cruthu::Node> GetSignificantNode() = 0;
};
} // namespace cruthu
#endif
Tera:
#ifndef CRUTHU_ITERA_HPP
#define CRUTHU_ITERA_HPP
#include <cruthu/IIndexer.hpp>
#include <cruthu/Node.hpp>
#include <memory>
#include <vector>
namespace cruthu {
class ITera {
public:
virtual ~ITera() = default;
virtual void SetNodes(std::vector<std::shared_ptr<cruthu::Node>>& nodes) = 0;
virtual void SetIndexedNodes(std::vector<std::shared_ptr<cruthu::Node>>& indexedNodes) = 0;
virtual void SetSignificantNode(std::shared_ptr<cruthu::Node> significantNode) = 0;
virtual std::vector<std::shared_ptr<cruthu::Node>>& GetNodes() = 0;
virtual std::vector<std::shared_ptr<cruthu::Node>>& GetIndexedNodes() = 0;
virtual std::shared_ptr<cruthu::Node> GetIndexedNode() = 0;
};
} // namespace cruthu
#endif
Indexer:
#ifndef CRUTHU_IINDEXER_HPP
#define CRUTHU_IINDEXER_HPP
#include <cruthu/Node.hpp>
#include <memory>
#include <vector>
namespace cruthu {
class IIndexer {
public:
virtual ~IIndexer() = default;
virtual std::vector<std::shared_ptr<cruthu::Node>> Index(std::shared_ptr<cruthu::Node> node) = 0;
virtual std::vector<std::shared_ptr<cruthu::Node>> Index(std::vector<std::shared_ptr<cruthu::Node>>& nodes) = 0;
};
} // namespace cruthu
#endif
Forma:
#ifndef CRUTHU_IFORMA_HPP
#define CRUTHU_IFORMA_HPP
#include <cruthu/Node.hpp>
namespace cruthu {
class IForma {
public:
virtual ~IForma() = default;
virtual void SetNode(std::shared_ptr<cruthu::Node> node) = 0;
virtual void Modify() = 0;
};
} // namespace cruthu
#endif
I did update the code and tried adding references in between, which is why there are now references in places. I still have the same issue.
As user Remy Lebeau stated please provide a minimal, complete and verifiable example.
As you stated, you are passing a std::vector<std::shared_ptr<Node>> from one class or function to another, and the nodes are not updated and show up as 0-initialized. Given the behavior you describe, I have a question for you; I'm posting it as an answer because it would be too long for a comment.
Does your function declaration/definition that accepts the vector or shared pointers above look something like this:
void someFunc( std::vector<std::shared_ptr<Node>> nodes ) { ... }
or does it look something like this:
void someFunc( std::vector<std::shared_ptr<Node>>& nodes ) { ... }
I ask this because it makes a difference if you are passing the container around by value as opposed to by reference.
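To make the difference concrete, here is a small sketch. Node is a hypothetical stand-in for cruthu::Node, and NodeList, raiseByValue and raiseByReference are names invented for the example. Copying the vector copies the shared_ptrs, not the Node objects, so changes made through the copied pointers still reach the caller's nodes; only changes to the container itself (such as push_back) are lost.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for cruthu::Node.
struct Node {
    int height = 0;
};

using NodeList = std::vector<std::shared_ptr<Node>>;

// Receives a copy of the vector: the pointers are copied, the Nodes are shared.
void raiseByValue(NodeList nodes) {
    for (auto& n : nodes) {
        n->height += 1;                          // visible to the caller
    }
    nodes.push_back(std::make_shared<Node>());   // only changes the local copy
}

// Receives the caller's vector itself.
void raiseByReference(NodeList& nodes) {
    for (auto& n : nodes) {
        n->height += 1;                          // visible to the caller
    }
    nodes.push_back(std::make_shared<Node>());   // the caller's vector grows
}
```

If the real code behaves like this sketch, heights that read back as 0 would suggest that the Node objects themselves are being copied or recreated somewhere, rather than the vector merely being passed by value.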
This is not (yet) an answer, but a set of questions to pin down the problem, since not enough of the implementation is provided.
One possible problem (hard to say without seeing the implementation...) is that you create the nodes at the top of the Run() function:
std::vector<std::shared_ptr<cruthu::Node>> nodes(teraGen->Create());
Then you pass that vector by reference in this call:
tera->SetNodes(nodes);
What does tera do with the nodes? Passing by reference means that the use count of the shared_ptrs isn't incremented.
What does this->CreateImage(tera) do?
Are the nodes used after Run() has finished?
I could not get it going with the comments above, mainly because I was not able to provide adequate information.
With that said, I re-worked the code to instead pass a cruthu::Tera object around as a shared pointer and exposed the vectors as public members of the class. This is something I will revisit at a later date, as this implementation is not something I feel happy about.
The code is on GitHub; unfortunately, it is my thesis work, so I could no longer wait.
If people feel the desire to still attempt an answer, I will work with them.

How to manage and process objects of a class only created on the heap?

I have an Object class and an ObjectManager class that is supposed to hold pointers to Objects created on the heap and is in charge of housekeeping (i.e., I don't want to keep pointers to temporary Objects, for instance when an object is passed to a function by value). I'd like to do some processing on the items in the ObjectManager class and later release the memory.
Please consider the following files:
"Object.h" file
#pragma once
#include<algorithm>
#include "ObjectManager.h"
class ObjectManager;
class Object{
private:
int value;
static bool heap_flag;
public:
Object() {
if (heap_flag) {
heap_flag = false;
ObjectManager::vo.push_back(this);
}
}
~Object() {}
void* operator new (size_t sz){
heap_flag = true;
return malloc(sz);
}
void setValue(int v) { value = v; }
};
and "ObjectManager.h"
#pragma once
#include "Object.h"
#include <vector>
using namespace std;
class Object;
class ObjectManager{
private:
ObjectManager() {}
public:
static vector <Object*> vo; // vector that holds pointers to all objects created on heap
static void releaseObjects() {
size_t index = 0;
for (auto o : vo){
// iterate through the vector and delete the object create on heap
delete o;
vo[index] = NULL;
index++;
}
}
};
finally in the client code:
#include <iostream>
#include "Object.h"
#include "ObjectManager.h"
using namespace std;
bool Object::heap_flag = false;
vector<Object*> ObjectManager::vo;
void process_Heap_Objects (vector<Object*>) {
// ... code to iterate through the elements of a vector and do some process
}
int main() {
Object o; // created on stack
Object* po = new Object(); // created on heap
ObjectManager::vo[0]->setValue(100);
process_Heap_Objects(ObjectManager::vo);
ObjectManager::releaseObjects();
return 0;
}
When I compile this file I get the following warning in VS2013: "warning C4150: deletion of pointer to incomplete type 'Object'; no destructor called
1> Objectmanager.h: see declaration of 'Object'"
The code compiles fine and works as expected, though.
Two questions:
1- What does the warning mean?
2- Is this a good design? Is there a better way to achieve this? What are your thoughts?
You can't call a destructor on a class that is only forward-declared.
You must put the destructor call (the delete) in a compilation unit where the full definition of the class, including its destructor, is visible (e.g. where you #include Object.h).
Also, stylistic tips:
On parameter passing:
If the object is not going to be mutated or copied by the method, pass by const reference: foo (const Object& bar) {}
If the object is going to be mutated but not copied by the method, pass by reference: foo (Object& bar) {}
If the object is going to be copied by the method, pass by value
If the method takes possession of the object, pass by pointer
#pragma once is not officially part of the standard, and rarely offers faster compilation in current-generation compilers. Most style guides recommend using include guards:
#ifndef SOME_NAME
#define SOME_NAME
... body ...
#endif
Your problem is you are defining and implementing both Object and ObjectManager in the header files. This creates a circular dependency because each header file includes the other.
A better approach would be to only have the class definitions in the headers and the bodies of the methods in cpp files.
The warning you're receiving is due to Object not being defined. Because you have #pragma once you aren't seeing the error you should be seeing (the effect of the circular dependency.) This is preventing the ObjectManager from seeing how the Object class is defined.
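A single-file sketch of the split suggested above may help. The comments mark which part would live in which file. ObjectManager::create() is a hypothetical factory that stands in for the heap_flag trick from the question, and the destroyed counter and getValue() exist only to make the effect observable; none of these are part of the original code.

```cpp
#include <cassert>
#include <vector>

// --- would live in ObjectManager.h ---
class Object;  // a forward declaration is enough to store Object*

class ObjectManager {
public:
    static Object* create();        // hypothetical factory (assumption)
    static void releaseObjects();   // declared here, defined where Object is complete
    static std::vector<Object*> vo;
};

// --- would live in Object.h ---
class Object {
    int value = 0;
public:
    static int destroyed;           // added only to observe destructor calls
    ~Object() { ++destroyed; }
    void setValue(int v) { value = v; }
    int getValue() const { return value; }
};

// --- would live in ObjectManager.cpp, which includes both headers ---
std::vector<Object*> ObjectManager::vo;
int Object::destroyed = 0;

Object* ObjectManager::create() {
    vo.push_back(new Object());
    return vo.back();
}

void ObjectManager::releaseObjects() {
    for (Object* o : vo) {
        delete o;                   // Object is complete here, so ~Object() runs
    }
    vo.clear();
}
```

Because the delete now appears after the full definition of Object, the compiler can see the destructor and has no reason to emit C4150. Routing creation through a factory also removes the need to guess in the constructor whether an object lives on the heap.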

Initialize a static pointer to parent in class object in C++

I'd like to be able to access all of the nodes in my Mesh class from the Element class. I believe I have both classes correctly setup to do this, but I'm not clear on the best practice to initialize the static pointer to Mesh inside of Element. What do you normally do for this?
Thanks!
The code (so far)
//Mesh.h
#pragma once
#ifndef MESH_H
#define MESH_H
#include <vector>
#include <map>
#include "Eigen/Eigen"
#include "Element.h"
#include "Node.h"
class Mesh {
public:
std::vector< Node* > m_Nodes;
std::vector< Element* > m_Elements;
Eigen::MatrixXd m_K;
Eigen::VectorXd m_F;
Eigen::VectorXd m_u;
void LoadFile(wchar_t* MeshFile);
};
#endif
//Element.h
#pragma once
#ifndef ELEMENT_H
#define ELEMENT_H
#include <vector>
class Mesh;
class Element {
public:
static Mesh * m_Parent;
static int m_ElementCount;
int m_ElementIndex;
std::vector< int > m_ElementNodes;
};
#endif
I lack reputation to comment on your post, so I have to post here.
Why not just have each Element instance have its own pointer to its parent? Each pointer can point to the same Mesh object. Is the memory savings really that important?
Even if you do end up with a singleton mesh, if, in the future, you do want to extend to multiple meshes, each with a group of elements, the change will be easier to do.
If you do need to have a static pointer to the parent mesh, you could encapsulate construction of the elements in a method of Mesh, and set the parent pointer to this in Mesh's constructor.
You have to define the static pointer (in any source file; not in *.h):
Mesh* Element::m_Parent; // initialized to NULL by the system
In addition, to initialize it, use the following syntax :
Element::m_Parent = whatever;
I tried not to change anything in your class design, but you should think about which member functions and fields should be static and which should not; the design doesn't look convenient now.
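Putting the two suggestions together (define the static pointer once, and let Mesh set it), a minimal sketch could look like the following. It is an illustration under assumptions: the classes are cut down to the relevant members, elements are stored by value rather than as Element*, and AddElement is a hypothetical helper.

```cpp
#include <cassert>
#include <vector>

class Mesh;

class Element {
public:
    static Mesh* m_Parent;   // declaration only
    int m_ElementIndex = 0;
};

// Definition: belongs in exactly one source file (e.g. Element.cpp).
Mesh* Element::m_Parent = nullptr;

class Mesh {
public:
    std::vector<Element> m_Elements;

    // The mesh registers itself as the parent of all elements.
    Mesh() { Element::m_Parent = this; }

    // Avoid leaving a dangling pointer behind when the mesh goes away.
    ~Mesh() {
        if (Element::m_Parent == this) {
            Element::m_Parent = nullptr;
        }
    }

    // Hypothetical helper: creates an element and assigns its index.
    Element& AddElement() {
        Element e;
        e.m_ElementIndex = static_cast<int>(m_Elements.size());
        m_Elements.push_back(e);
        return m_Elements.back();
    }
};
```

With a static pointer, only one Mesh can be the parent at a time; constructing a second Mesh in this sketch silently redirects every Element to it, which is the limitation the first answer warns about.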

what is a good place to put a const in the following C++ statement

Consider the following class member:
std::vector<sim_mob::Lane *> IncomingLanes_;
The above container stores pointers to some of my Lane objects. I don't want the subroutines that take this variable as an argument to be able to modify the Lane objects.
At the same time, I don't know where to put the 'const' keyword so that it does not stop me from populating the container.
Could you please help me with this?
Thank you and regards,
vahid
Edit:
Based on the answers I got so far (many thanks to them all), suppose this sample:
#include <vector>
#include<iostream>
using namespace std;
class Lane
{
private:
int a;
public:
Lane(int h):a(h){}
void setA(int a_)
{
a=a_;
}
void printLane()
{
std::cout << a << std::endl;
}
};
class B
{
public:
vector< Lane const *> IncomingLanes;
void addLane(Lane *l)
{
IncomingLanes.push_back(l);
}
};
int main()
{
Lane l1(1);
Lane l2(2);
B b;
b.addLane(&l1);
b.addLane(&l2);
b.IncomingLanes.at(1)->printLane();
b.IncomingLanes.at(1)->setA(12);
return 1;
}
What I meant was:
b.IncomingLanes.at(1)->printLane()
should work on IncomingLanes with no problem AND
b.IncomingLanes.at(1)->setA(12)
should not be allowed. (In the above example neither of the two calls compiles!)
Besides solving the problem, I am also looking for good programming practice. So if you think there is a solution to the above problem that is bad practice, please let us all know.
Thanks again
A detour first: use a smart pointer such as shared_ptr rather than raw pointers within your container. This would make your life a lot easier down the line.
Typically, what you are looking for is called design-const, i.e. functions which do not modify their arguments. You achieve this by passing arguments via const reference. Also, if it is a member function, make the function const (i.e. this becomes a pointer to const within the scope of this function, and thus you cannot use this to write to the members).
Without knowing more about your class it would be difficult to advise you to use a container of const references to lanes. That would make inserting lane objects difficult: a one-time affair, possible only via initializer lists in the ctor(s).
A few must reads:
The whole of FAQ 18
Sutter on const-correctness
Edit: code sample:
#include <vector>
#include <iostream>
#include <memory>
//using namespace std; I'd rather type the 5 characters
// This is almost redundant under the current circumstance
class Lane
{
private:
int a;
public:
Lane(int h):a(h){}
void setA(int a_) // do you need this?
{
a=a_;
}
void printLane() const // design-const
{
std::cout << a << std::endl;
}
};
class B
{
// be consistent with namespace qualification
std::vector< Lane const * > IncomingLanes; // don't expose impl. details
public:
void addLane(Lane const& l) // who's responsible for freeing `l'?
{
IncomingLanes.push_back(&l); // would change
}
void printLane(size_t index) const
{
#ifdef _DEBUG
IncomingLanes.at( index )->printLane();
#else
IncomingLanes[ index ]->printLane();
#endif
}
};
int main()
{
Lane l1(1);
Lane l2(2);
B b;
b.addLane(l1);
b.addLane(l2);
//b.IncomingLanes.at(1)->printLane(); // this is bad
//b.IncomingLanes.at(1)->setA(12); // this is bad
b.printLane(1);
return 1;
}
Also, as Matthieu M. suggested:
shared ownership is more complicated because it becomes difficult to
tell who really owns the object and when it will be released (and
that's on top of the performance overhead). So unique_ptr should be
the default choice, and shared_ptr a last resort.
Note that unique_ptrs may require you to move them using std::move. I am updating the example to use pointer to const Lane (a simpler interface to get started with).
You can do it this way:
std::vector<const sim_mob::Lane *> IncomingLanes_;
Or this way:
std::vector<sim_mob::Lane const *> IncomingLanes_;
In C/C++, const typename * and typename const * are identical in meaning.
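This equivalence can be checked at compile time. The sketch below uses int in place of sim_mob::Lane, and readThrough and writeThrough are names made up for the example; it also contrasts both spellings with the third placement, where const comes after the * and applies to the pointer rather than to the pointee.

```cpp
#include <cassert>
#include <type_traits>

// Both spellings name the same type: pointer to const int.
static_assert(std::is_same<const int*, int const*>::value,
              "const T* and T const* are the same type");

// const after the * is different: a const pointer to a mutable int.
static_assert(!std::is_same<const int*, int* const>::value,
              "T* const is not the same as const T*");

// May read through p, may not write through it.
int readThrough(const int* p) { return *p; }

// May write through p, may not make p point elsewhere.
void writeThrough(int* const p, int v) { *p = v; }
```

For the container in the question, this means std::vector<const sim_mob::Lane *> protects the Lane objects, while the stored pointers themselves can still be added, replaced, and removed.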
Updated to address updated question:
If really all you need to do is
b.IncomingLanes.at(1)->printLane()
then you just have to declare printLane like this:
void printLane() const // Tell compiler that printLane doesn't change this
{
std::cout << a << std::endl;
}
I suspect that you want the object to be able to modify the elements (i.e., you don't want the elements to truly be const). Instead, you want nonmember functions to only get read-only access to the std::vector (i.e., you want to prohibit changes from outside the object).
As such, I wouldn't put const anywhere on IncomingLanes_. Instead, I would expose IncomingLanes_ as a pair of std::vector<sim_mob::Lane *>::const_iterators (through methods called something like GetIncomingLanesBegin() and GetIncomingLanesEnd()).
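A rough sketch of that iterator-pair interface follows, with a cut-down Lane and B. The LaneIterator typedef, the getA() accessor and the sumLanes function are assumptions added for the example. It also shows a limit of the approach: a const_iterator over std::vector<Lane*> stops callers from changing the vector, but each element is still a pointer to a non-const Lane.

```cpp
#include <cassert>
#include <vector>

class Lane {
    int a;
public:
    explicit Lane(int h) : a(h) {}
    void setA(int a_) { a = a_; }
    int getA() const { return a; }   // hypothetical accessor
};

class B {
    std::vector<Lane*> IncomingLanes_;   // no const here: B may modify its lanes
public:
    typedef std::vector<Lane*>::const_iterator LaneIterator;

    void addLane(Lane* l) { IncomingLanes_.push_back(l); }

    // Read-only view of the container for code outside the class.
    LaneIterator GetIncomingLanesBegin() const { return IncomingLanes_.begin(); }
    LaneIterator GetIncomingLanesEnd() const { return IncomingLanes_.end(); }
};

// Outside code can walk the lanes but cannot add, remove, or reseat pointers.
int sumLanes(const B& b) {
    int total = 0;
    for (B::LaneIterator it = b.GetIncomingLanesBegin();
         it != b.GetIncomingLanesEnd(); ++it) {
        total += (*it)->getA();
    }
    return total;
}
```

Note that (*it)->setA(12) would still compile for outside code, because dereferencing the const_iterator yields a Lane* const, not a const Lane*. If the Lane objects must also be read-only from outside, the element type has to be const Lane* or the lanes have to be exposed by const reference.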
You may declare it like this:
std::vector<const sim_mob::Lane *> IncomingLanes_;
You will be able to add, replace, or remove pointers in the vector, but you won't be able to modify the Lane objects through them. See below:
IncomingLanes_.push_back(someLine); // OK
IncomingLanes_[0] = someLine; // OK: only the pointee is const, the stored pointer can be reseated
IncomingLanes_[0]->some_member = something; // error
IncomingLanes_.erase(IncomingLanes_.begin()); // OK (erasing end() would be undefined behavior)
IncomingLanes_[0]->nonConstMethod(); // error
If you don't want other routines to modify IncomingLanes, but you do want to be able to modify it yourself, just use const in the function declarations that you call.
Or if you don't have control over the functions, when they're external, don't give them access to IncomingLanes directly. Make IncomingLanes private and provide a const getter for it.
I don't think what you want is possible without making the pointers stored in the vector const as well.
const std::vector<sim_mob::Lane*> // means the vector and the pointers stored in it are const, not the Lanes they point at
std::vector<const sim_mob::Lane*> // means no one can modify the data pointed at.
At best, the second version does what you want, but you will have this construct throughout your code wherever you do want to modify the data:
const_cast<sim_mob::Lane*>(theVector[i])->non_const_method();
Have you considered a different class hierarchy where sim_mob::Lane's public interface is const and sim_mob::Really_Lane contains the non-const interfaces? Then users of the vector cannot be sure a "Lane" object is "real" without using dynamic_cast.
Before we get to const goodness, you should first use encapsulation.
Do not expose the vector to the external world, and it will become much easier.
A weak (*) encapsulation here is sufficient:
class B {
public:
std::vector<Lane> const& getIncomingLanes() const { return incomingLanes; }
void addLane(Lane l) { incomingLanes.push_back(l); }
private:
std::vector<Lane> incomingLanes;
};
The above is very simple, and yet achieves the goal:
clients of the class cannot modify the vector itself
clients of the class cannot modify the vector content (Lane instances)
and of course, the class can access the vector content fully and modify it at will.
Your new main routine becomes:
int main()
{
Lane l1(1);
Lane l2(2);
B b;
b.addLane(l1);
b.addLane(l2);
b.getIncomingLanes().at(1).printLane();
b.getIncomingLanes().at(1).setA(12); // expected-error\
// { passing ‘const Lane’ as ‘this’ argument of
// ‘void Lane::setA(int)’ discards qualifiers }
return 1;
}
(*) This is weak in the sense that, even though the attribute itself is not exposed, we give a reference to it to the external world, so in practice clients are not really shielded.