Declaration Derivation in Language Parser


I'm trying to write a lightweight library for parsing C source code.
Here's the way I've considered writing a declaration parser:
Decl CParser::ParseDecl(std::istream& in);
Or something like:
void CParser::Parse(std::istream& in);
// and
virtual void CParser::OnDecl(const Decl& decl);
Where Decl is a base class that may be inherited by either a TypedefDecl, FunctionDecl, or VariableDecl.
Is it okay that client code will have to cast into a derived class to get more information about the declaration? Or is there a better way to do this?
Edit:
The function itself isn't very well defined yet; it may actually be a callback, like CParser::OnDecl(const Decl& decl), which may be overridden by a derived class like CFormatter : public CParser or something. That's not entirely part of the question.
I'm really just curious whether it's okay that a client of the library will have to cast the Decl object. There are a lot of different declaration types in the C language (even more in C++), and it seems like writing a callback or a parser for each one of them would be just as bad as having to derive from the base class.

First of all, you have to avoid slicing, for example by returning a pointer.
Decl* CParser::ParseDecl(std::istream& in);
Then, generally speaking, forcing a client to cast a return value is a symptom of a bad design. What if they cast to the wrong type? How would they know which type to cast to? If the user makes the wrong cast, it's undefined behaviour (and extremely nasty bugs).
CParser cp;
...
Decl* token = cp.ParseDecl(ifs);
FunctionDecl *ftoken;
VariableDecl *vtoken;
if (????) { // <============ how does your user know ?
ftoken = static_cast<FunctionDecl*>(token);
//... do something with ftoken specific to function declarations
}
else if (????) {
vtoken = static_cast<VariableDecl*>(token);
//... do something specific for variable declarations
}
To make things more robust, you should at least make the type polymorphic by giving it one or more virtual functions. Then your client can use the safer dynamic_cast, which returns nullptr when the pointer does not refer to the requested type, and make the right decision:
...
if (ftoken = dynamic_cast<FunctionDecl*>(token)) { // if token is not a FunctionDecl, ftoken is set to nullptr and control goes to the else part
//... do something with ftoken specific to function declarations
}
else if (vtoken = dynamic_cast<VariableDecl*>(token)) { // attention: it's = and not ==
//... do something specific for variable declarations
}
This would be an acceptable design. But if you have polymorphic types, you could rethink your design to make use of this polymorphism, instead of forcing the user to take care of casting. One possible way is to define the class-specific functions as virtual ones:
class Decl {
public:
virtual void display_details() { cout << "No detail for this token"; }
...
};
class VariableDecl : public Decl {
...
void display_details() override { cout<<"This is variable "<<name; }
};
class FunctionDecl : public Decl {
...
void display_details() override { cout<<"This is function "<<name<<" with "<<nparam<< " parameters"; }
};
The user could then just refer to the specific building blocks, without worrying too much about the real type of the object:
Decl* token = cp.ParseDecl(ifs);
token->display_details();
Another popular design for more complex situations is the visitor design pattern. This is for example used by boost::variant: it could be worth looking at their examples, even if you don't intend to use this library.
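As a rough illustration of the visitor idea, here is a minimal sketch that assumes a Decl hierarchy like the one in the question. DeclVisitor, accept(), Describer and describe() are hypothetical names introduced only for this example; they are not part of the asker's library or of boost::variant.

```cpp
#include <string>

struct FunctionDecl;
struct VariableDecl;

// Hypothetical visitor interface: one overload per concrete declaration type.
struct DeclVisitor {
    virtual ~DeclVisitor() {}
    virtual void visit(const FunctionDecl& d) = 0;
    virtual void visit(const VariableDecl& d) = 0;
};

struct Decl {
    std::string name;
    explicit Decl(const std::string& n) : name(n) {}
    virtual ~Decl() {}
    // Each derived class forwards itself to the matching visit() overload.
    virtual void accept(DeclVisitor& v) const = 0;
};

struct FunctionDecl : Decl {
    int nparam;
    FunctionDecl(const std::string& n, int np) : Decl(n), nparam(np) {}
    void accept(DeclVisitor& v) const override { v.visit(*this); }
};

struct VariableDecl : Decl {
    explicit VariableDecl(const std::string& n) : Decl(n) {}
    void accept(DeclVisitor& v) const override { v.visit(*this); }
};

// Example client: builds a description without any cast.
struct Describer : DeclVisitor {
    std::string text;
    void visit(const FunctionDecl& d) override {
        text = "function " + d.name + " with " + std::to_string(d.nparam) + " parameters";
    }
    void visit(const VariableDecl& d) override {
        text = "variable " + d.name;
    }
};

// Hypothetical helper: describe any declaration through the base class.
std::string describe(const Decl& d) {
    Describer v;
    d.accept(v);
    return v.text;
}
```

Client code only ever holds a Decl; calling describe() on it selects the right visit() overload through accept(). The trade-off is that adding a new declaration type means adding a visit() overload to every visitor.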

Related

C++ Dynamic Dispatch Function

I'm trying to create an overloaded function that will be called with the dynamic type of an object. I try to do this without interfering with the actual class structure underneath, as I don't have direct access (i.e. I cannot add virtual methods, etc.)
As a concrete example, let's think of an AST class structure that looks somewhat like this:
class ASTNode {}; // this one is fully abstract; i.e. there's a virtual void method() = 0;
class Assignment : public ASTNode {};
class Expression : public ASTNode {};
class StringExpr : public Expression {};
class MathExpr : public Expression {};
I want to write a function act that will take an instance of ASTNode as parameter and, depending on its actual dynamic type do something different.
The call will be something like this
std::shared_ptr<ASTNode> parsedAST = get_a_parsed_ASTNode(); // ... received from some parser or library
act(parsedAST);
Then, I want to act, depending on the dynamic type of the ASTNode.
void act(std::shared_ptr<MathExpr> expr)
{
// Do something with Math expressions, e.g. evaluate their value
};
void act(std::shared_ptr<StringExpr> expr)
{
// Do something with String expressions, e.g. write their value to the log
};
void act(std::shared_ptr<Expression> expr)
{
// do something with other types of expressions (e.g. Boolean expressions)
};
Currently, though, I cannot call act(parsedAST) directly, since overload resolution uses the static type of the argument, which may not be the "most concrete type". Instead, I have to write a dispatcher by hand as follows, but the method is a bit silly in my opinion, since it does literally nothing but dispatch.
void act(std::shared_ptr<ASTNode> node_ptr)
{
if(std::shared_ptr<MathExpr> derived_ptr = std::dynamic_pointer_cast<MathExpr>(node_ptr))
{
act(derived_ptr);
}
else if(std::shared_ptr<StringExpr> derived_ptr = std::dynamic_pointer_cast<StringExpr>(node_ptr))
{
act(derived_ptr);
}
else if(std::shared_ptr<Expression> derived_ptr = std::dynamic_pointer_cast<Expression>(node_ptr))
{
// do something with generic expressions. Make sure that this is AFTER the more concrete if casts
}
else if( ... ) // more of this
{
}
// more else if
else
{
// default action or raise invalid argument exception or so...
}
};
This is especially annoying & error-prone since my class hierarchy has many (> 20) different concrete classes that can be instantiated. Also, I have various act-functions, and when I refactor things (e.g. add an act for an additional type), I have to make sure to pay attention to the correct order of if(dynamic_pointer_cast) within the dispatcher.
Also it's not that stable, since a change in the underlying class hierarchy will require me to change every dispatcher directly, rather than just the specific act functions.
Is there a better / smarter solution? Evidently I'd appreciate "native" solutions, but I'm willing to consider libraries too.

I've never encountered this problem myself, but I can think of the following solution.
Create your own hierarchy that mimics the original one and has a virtual act. The base wrapper holds a pointer to the original base class, and each derived wrapper casts it to the corresponding derived pointer.
Now, to create the needed wrapper, you don't need properly ordered dynamic_casts: dispatch on the typeid name string instead. So your dispatcher is a map from string to wrapper factory.
Sure, you need RTTI for the typeid string, but you would need it for dynamic_cast as well.
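A simplified sketch of this idea follows. It is a variation, not the exact scheme described: it keys the map on std::type_index instead of the typeid name string, and stores plain handler functions instead of wrapper factories. The Dispatcher class, its add() and dispatch() members, and the string return values are assumptions made for illustration.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

// Stand-ins for the third-party hierarchy (assumed polymorphic, as in the question).
struct ASTNode { virtual ~ASTNode() {} };
struct Expression : ASTNode {};
struct StringExpr : Expression {};
struct MathExpr : Expression {};

// The type-specific overloads from the question; here they only return a label.
std::string act(const std::shared_ptr<MathExpr>&)   { return "math"; }
std::string act(const std::shared_ptr<StringExpr>&) { return "string"; }

// Hypothetical dispatcher: maps the exact dynamic type to a handler.
class Dispatcher {
    typedef std::function<std::string(const std::shared_ptr<ASTNode>&)> Handler;
    std::unordered_map<std::type_index, Handler> handlers_;
public:
    // Registers the act() overload for T; the downcast is written once, here.
    template <class T>
    void add() {
        handlers_[std::type_index(typeid(T))] =
            [](const std::shared_ptr<ASTNode>& n) {
                return act(std::static_pointer_cast<T>(n));
            };
    }
    // Looks up the handler for the dynamic type of *n (n must not be null).
    std::string dispatch(const std::shared_ptr<ASTNode>& n) const {
        const ASTNode& node = *n;
        auto it = handlers_.find(std::type_index(typeid(node)));
        if (it == handlers_.end())
            return "unhandled";
        return it->second(n);
    }
};
```

Registration order no longer matters, because the lookup uses the exact dynamic type. The cost is that an unregistered subclass does not fall back to a handler for one of its bases, which the ordered dynamic_pointer_cast chain did provide.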

C++ Derived class overriding member of base class with another derived class?

I have classes like this:
class ParkingLot
{
public:
int spaces;
virtual bool something() { return true; }
};
class ParkingLotBuilding
{
public:
ParkingLot Floor1, Floor2;
};
I've got a whole lot of functions that take ParkingLotBuilding. Let's say someone (me) derives from ParkingLot and ParkingLotBuilding:
class DerivedParkingLot : public ParkingLot
{
public:
virtual bool something() { return false; }
};
class DerivedParkingLotBuilding : public ParkingLotBuilding
{
public:
// how can I make it so that Floor1 and Floor2 are for DerivedParkingLot?
};
I've got functions I don't control that are like this:
void CheckBuilding( ParkingLotBuilding &building )
{
if(building.Floor1.something() == true)
// error
}
If I pass a DerivedParkingLotBuilding object to that function, how do I make it call DerivedParkingLot::something() so that it returns false? Is that possible? Sorry if I didn't explain this right; I'm not sure how to ask about the problem. Thanks.

As JohnSmith pointed out, you can't override data members, just member functions. Since ParkingLotBuilding contains ParkingLot values, and not ParkingLot pointers or references, they can't be used polymorphically, even in DerivedParkingLotBuilding. (That's just how C++ works: only pointers and references can have a dynamic type.)
That means that if you can't change the ParkingLotBuilding class (or the CheckBuilding function), then you're stuck. There is no deriving you can do that will get the CheckBuilding function to operate on a DerivedParkingLot object.
The moral of this story is that classes must be designed for inheritance from the beginning.
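To illustrate why members held by value cannot behave polymorphically, here is a minimal sketch. BuildingByValue and BuildingByRef are hypothetical stand-ins for ParkingLotBuilding (the second one is a redesign, which the asker said is not possible for the real class), and something() is made const for the example.

```cpp
struct ParkingLot {
    virtual ~ParkingLot() {}
    virtual bool something() const { return true; }
};

struct DerivedParkingLot : ParkingLot {
    bool something() const override { return false; }
};

// Member held by value: its dynamic type is always ParkingLot.
struct BuildingByValue {
    ParkingLot Floor1;
};

// Hypothetical redesign: member held by reference, so the dynamic type can vary.
struct BuildingByRef {
    ParkingLot& Floor1;
    explicit BuildingByRef(ParkingLot& lot) : Floor1(lot) {}
};

// Copying a derived object into a by-value member slices it down to ParkingLot.
bool sliced_result() {
    DerivedParkingLot derived;
    BuildingByValue b;
    b.Floor1 = derived;            // only the ParkingLot part is copied
    return b.Floor1.something();   // calls ParkingLot::something()
}

// Through a reference, the call reaches the DerivedParkingLot override.
bool polymorphic_result() {
    DerivedParkingLot derived;
    BuildingByRef b(derived);
    return b.Floor1.something();   // calls DerivedParkingLot::something()
}
```

sliced_result() returns true and polymorphic_result() returns false, which is the difference the answer describes: only the reference member can refer to a DerivedParkingLot.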

So you just want to call the DerivedParkingLot function from a ParkingLot instance?
Your code already does this by declaring the something method as virtual: a call is automatically dispatched to the most-derived override in the object's inheritance tree.
A simple way to test it is to implement the something method in both ParkingLot and DerivedParkingLot, put a different message in each, and check which one is printed.

One way you might be able to approach this is by making ParkingLotBuilding a template class.
template<typename T>
class ParkingLotBuilding
{
public:
T Floor1, Floor2;
};
Then when creating a ParkingLotBuilding, you could use these types:
ParkingLotBuilding<ParkingLot>
ParkingLotBuilding<DerivedParkingLot>
Also if you don't like templating all the time and want to just use ParkingLotBuilding and DerivedParkingLotBuilding, you could rename the class to something like Building and use typedefs:
typedef Building<ParkingLot> ParkingLotBuilding;
typedef Building<DerivedParkingLot> DerivedParkingLotBuilding;
This approach isn't exactly inheritance between the ParkingLotBuilding types (and may not be the best approach - I've never seen this before), but it might do what you need.

In your example, Floor1 has no way of knowing whether it was instantiated inside of ParkingLotBuilding or DerivedParkingLotBuilding.
You could use RTTI to deal with this, something like:
void CheckBuilding (ParkingLotBuilding *building)
{
if (dynamic_cast<DerivedParkingLotBuilding*>(building))
{
// Floor is in a derived parking lot building
}
else
{
// Floor is in a parking lot building
}
}
Not exactly the best way of doing this though, as pointed out above.

Generic class variable of a certain type

In C# I can define this:
public interface BaseObject
{
int GetValue();
}
public class Test<T> where T : BaseClass
{
T BaseObject;
}
which means I know that I can always call BaseObject.GetValue() / BaseObject->GetValue(), because I know that the base object has this method.
Is there a similar way to do this in C++? That is, can I define an interface that multiple classes can inherit, and a class that can take advantage of this?

Use templates, which are even more powerful than C# generics (which is not to say they are necessarily better, just different).
template<class T>
class foo
{
public:
int whatever()
{
return obj.GetValue();
}
private:
T obj;
};
A separate class is created for each template argument you use. If you provide a template type which would result in an error you will know at compile time.

You're asking about C++ concepts, a way to specify requirements for template parameters. They were proposed during the work on C++11, but proved complicated enough that they weren't done in time. But they've just been delayed, not forgotten.
In the meantime, duck typing remains very powerful, and it will catch when you pass a template parameter that doesn't have the required interface. It just won't report the problem as neatly.
As a workaround, a simple way to check the constraint you showed takes advantage of the fact that pointer conversions are implicit only when upcasting:
template<class T>
class Test
{
static T* enforcement_helper = 0;
static BaseClass* enforce_inheritance_constraint = enforcement_helper;
};
Depending on how new your compiler is, you may need to put those lines inside a special member function (destructor is good, because it's almost always processed).
But you should only check constraints in order to improve error messages (by causing the failure in a clearly commented section of code). C++ templates are duck typed, and they will work with any template parameters that provide the required operations. No formal "interface" is required.
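With a C++11 compiler, the same kind of check can be written with static_assert and std::is_base_of, purely to get a clearer error message. This is a minimal sketch under assumptions: BaseClass is given a GetValue() method as in the question's interface, and Seven and the value() member are hypothetical names used only here.

```cpp
#include <type_traits>

// Assumed interface, modelled on the question's C# example.
struct BaseClass {
    virtual ~BaseClass() {}
    virtual int GetValue() const = 0;
};

// Hypothetical implementation used to exercise the template.
struct Seven : BaseClass {
    int GetValue() const override { return 7; }
};

template <class T>
class Test {
    // Fails to compile, with this message, when T does not derive from BaseClass.
    static_assert(std::is_base_of<BaseClass, T>::value,
                  "Test<T> requires T to derive from BaseClass");
    T obj_;
public:
    int value() const { return obj_.GetValue(); }
};
```

Test<Seven> compiles and its value() forwards to Seven::GetValue(), while something like Test<int> is rejected at the static_assert line rather than deep inside value().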

C++ design with static methods

I would like to define a class X with a static method:
class X
{
static string get_type () {return "X";}
//other virtual methods
};
I would like to force classes which inherit from X to redefine the get_type() method
and return strings different from "X" (I am happy if they just redefine get_type for now).
How do I do this? I know that I cannot have virtual static methods.
Edit: The question is not about the type_id, but in general about a static method that
should be overridden. For example
class X {
static int getid() {return 1;}
};

template<int id>
class X {
public:
static int getid() { return id; }
};
class Y : public X<2> {
};
You haven't overridden the method, but you've forced every subclass to provide an ID. Caveat: I haven't tried this, there might be some subtle reason why it wouldn't work.
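A quick way to try the idea, as a sketch. Z is a hypothetical second subclass added for the comparison.

```cpp
template <int id>
class X {
public:
    // The ID comes from the template argument chosen by the subclass.
    static int getid() { return id; }
};

class Y : public X<2> {};
class Z : public X<3> {};
```

Y::getid() yields 2 and Z::getid() yields 3. Note that X<2> and X<3> are distinct types, so Y and Z no longer share a common base class; if the other virtual methods of X are needed polymorphically, they would have to move to a separate non-template base.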

If I'm not mistaken, to call the static method, you have to invoke the method by specifying the exact name of the class, e.g X::get_type();, DerivedClass::get_type() etc and in any case, if called on an object, the dynamic type of the object is not taken into account. So at least in the particular case, it will probably only be useful in a templated context when you are not expecting polymorphic behavior.
However, I don't see why it shouldn't be possible to force each interesting class (inherited or not, since "compile-time polymorphism" doesn't care) to provide this functionality with templates. In the following case, you must specialize the get_type function or you'll have a compile-time error:
#include <string>
struct X {};
struct Derived: X {};
template <class T> std::string get_type() {
static_assert(sizeof(T) == 0, "get_type not specialized for given type");
return std::string();
}
template <> std::string get_type<X>() {
return "X";
}
int main() {
get_type<X>();
get_type<Derived>(); //error
}
(static_assert is C++0x, otherwise use your favourite implementation, e.g BOOST_STATIC_ASSERT. And if you feel bad about specializing functions, specialize a struct instead. And if you want to force an error if someone accidentally tries to specialize it for types not derived from X, then that should also be possible with type_traits.)

I'd say you know the why but just in case here's a good explanation:
http://publib.boulder.ibm.com/infocenter/lnxpcomp/v8v101/index.jsp?topic=/com.ibm.xlcpp8l.doc/language/ref/cplr139.htm
It looks like you're going to have to design your way out of this. Perhaps a virtual function that wraps a Singleton?

Don't do that, use typeid instead.

To make a long story short, you can't do it. The only way to require a derived class to override a base class function is to make it a pure virtual (which can't be static).
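For comparison, here is a minimal sketch of the pure virtual approach, combined with a static helper for calls that have no object. The name static_type() and the choice to forward to it are assumptions made for this example.

```cpp
#include <string>

struct X {
    virtual ~X() {}
    // Pure virtual: X is abstract until a subclass provides get_type().
    virtual std::string get_type() const = 0;
};

struct Y : X {
    // A static function can still supply the value for calls without an object.
    static std::string static_type() { return "Y"; }
    std::string get_type() const override { return static_type(); }
};
```

The compiler only enforces the override for the first concrete class: a class derived from Y would inherit Y's get_type() without complaint.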

You can't do this for a number of reasons. You can't define the function in X and have it be pure virtual. You can't have virtual static functions at all.
Why must they be static?

Here you go
class X
{
static string get_type() {return "X"; }
};
class Y : public X
{
static string get_type() {return "Y"; }
};
The code above does exactly what you requested: the derived class redefines get_type and returns a different string. If this is not what you want, you have to explain why. You have to explain what it is you are trying to do and what behavior you expect from that static method. It is absolutely unclear from your original question.

You mention a few places about guaranteeing that the child types yield unique values for your function. This is, as others have said, impossible at compile time [at least, without the use of templates, which might or might not be acceptable]. But if you delay it until runtime, you can maybe pull something similar off.
class Base {
static std::vector<std::pair<const std::type_info*, int> > datas;
typedef std::vector<std::pair<const std::type_info*, int> >::iterator iterator;
public:
virtual ~Base() { }
int Data() const {
const std::type_info& info = typeid(*this);
for(iterator i = datas.begin(); i != datas.end(); ++i)
if(*(i->first) == info) return i->second;
throw "Unregistered Type";
}
static bool RegisterClass(const Base& p, int data) {
const std::type_info& info = typeid(p);
for(iterator i = datas.begin(); i != datas.end(); ++i) {
if(i->second == data) {
if(*(i->first) != info) throw "Duplicate Data";
return true;
}
if(*(i->first) == info) throw "Reregistering";
}
datas.push_back(std::make_pair(&info, data));
return true;
}
};
std::vector<std::pair<const std::type_info*, int> > Base::datas;
class Derived : public Base { };
const bool DerivedRegisterFlag = Base::RegisterClass(Derived(), 10);
class OtherDerived : public Base { };
const bool OtherDerivedRegisterFlag = Base::RegisterClass(OtherDerived(), 10); //exception
Caveats: This is completely untested. The exceptions would get thrown before entering main if you do it this way. You could move the registration into constructors, and accept the per-instance overhead of registration checking if you'd rather.
I chose an unordered vector for simplicity; I'm not sure if type_info::before provides the necessary semantics to be used as a predicate for a map, and presumably you won't have so many derived classes that a linear search would be problematic anyhow. I store a pointer because you can't copy type_info objects directly. This is mostly safe, since the lifetime of the object returned by typeid is the entire program. There might be issues when the program is shutting down, I'm not sure.
I made no attempt to protect against static order of initialization errors. As written, this will fail at some point.
Finally, no, it isn't static, but "static" and "virtual" don't really make sense together anyhow. If you don't have an instance of the type to act on, then how do you know which overridden method to choose? There are a few cases with templates where you might legitimately want to call a static method without an actual object, but that's not likely to be common.
*edit: Also, I'm not sure how this interacts with dynamically linked libraries and the like. My suspicion is that RTTI is unreliable in those situations, so obviously this is similarly unreliable.

Use Delphi, it supports virtual static members on classes. ;>

Apologies for resurrecting this thread, but I've just encountered this moral crisis as well. This is a very bold and possibly foolish statement to make, but I wholeheartedly disagree with what most people are saying about static virtual not making any sense. This dilemma stems from how static members are commonly used versus what they're actually doing underneath.
People often express facts using static classes and/or members - something that is true for all instances if instances are relevant, or simply facts about the world in the case of static classes. Suppose you're modelling a Philosophy class. You might define abstract class Theory to represent a theory which is to be taught, then inherit from Theory in TheoryOfSelf, TheoryOfMind and so on. To teach a Theory, you'd really want a method called express() which expresses a theory using a particular turn of phrase appropriate to the audience. One would assume that any inheriting class should expose an identical method express(). If I were able to, I would model this relationship using static virtual Theory.express() - it is both a statement of fact transcending the concept of instances (therefore static) and nonspecific, requiring a specific implementation by each type of theory (therefore virtual).
I completely agree however with people justifying the prohibition on the grounds of what static is actually doing - it makes perfect sense in terms of coding principles, the issue arises from the customary ways people commonly model the real world.
The best resolution to this problem I've been able to think of is to model Theory as a singleton instead - there may be an instance of a theory, but there's only ever one of them. If you want an alternative, it's a different type, so create a new derived class. To me this approach just seems arbitrary and introduces unnecessary noise.

Concrete class specific methods

I have an interesting problem. Consider this class hierarchy:
class Base
{
public:
virtual float GetMember( void ) const =0;
virtual void SetMember( float p ) =0;
};
class ConcreteFoo : public Base
{
public:
ConcreteFoo( "foo specific stuff here" );
virtual float GetMember( void ) const;
virtual void SetMember( float p );
// the problem
void foo_specific_method( "arbitrary parameters" );
};
Base* DynamicFactory::NewBase( std::string drawable_name );
// it would be used like this
Base* foo = dynamic_factory.NewBase("foo");
I've left out the DynamicFactory definition and how Builders are
registered with it. The Builder objects are associated with a name
and will allocate a concrete implementation of Base. The actual
implementation is a bit more complex, with shared_ptr to handle memory
reclamation, but those details are not important to my problem.
ConcreteFoo has a class-specific method. But since the concrete instances
are created in the dynamic factory, the concrete classes are not known or
accessible; they may only be declared in a source file. How can I
expose foo_specific_method to users of Base*?
I'm adding the solutions I've come up with as answers. I've named
them so you can easily reference them in your answers.
I'm not just looking for opinions on my original solutions, new ones
would be appreciated.

The cast would be faster than most other solutions; however, here is an alternative.
In the Base class add:
public:
void passthru( const string &concreteClassName, const string &functionname, vector<string*> args )
{
if( concreteClassName == className )
runPassThru( functionname, args );
}
protected: // derived classes set className and fill funcmap
string className;
map<string, int> funcmap;
virtual void runPassThru( const string &functionname, vector<string*> args ) {}
In each derived class:
void runPassThru( const string &functionname, vector<string*> args )
{
map<string, int>::const_iterator it = funcmap.find( functionname );
if( it == funcmap.end() )
return; // unknown function name
switch( it->second )
{
case 1:
// verify args
// call function
break;
// etc.
}
}
// call in constructor
void registerFunctions()
{
funcmap[ "functionName" ] = id; // id is the case label used in runPassThru
// etc.
}

The CrazyMetaType solution.
This solution is not well thought out. I was hoping someone might
have had experience with something similar. I saw this applied to the
problem of an unknown number of a known type. It was pretty slick. I
was thinking of applying it to an unknown number of unknown types.
The basic idea is that the CrazyMetaType collects the parameters in a
type-safe way, then executes the concrete-specific method.
class Base
{
...
virtual CrazyMetaType concrete_specific( int kind ) =0;
};
// used like this
foo->concrete_specific(foo_method_id) << "foo specific" << foo_specific;
My one worry with this solution is that CrazyMetaType is going to be
insanely complex to get working. I'm up to the task, but I
cannot count on future users being C++ experts just to add
one concrete-specific method.

Add special functions to Base.
The simplest and most unacceptable solution is to add
foo_specific_method to Base. Then classes that don't
use it can just define it to be empty. This doesn't work because
users are allowed to register their own Builders with the
dynamic_factory. The new classes may also have concrete class
specific methods.
In the spirit of this solution, here is one that is slightly better: add generic
functions to Base.
class Base
{
...
/// \return true if 'kind' supported
virtual bool concrete_specific( int kind, "foo specific parameters" );
};
The problem here is that there may be quite a few overloads of
concrete_specific for different parameter sets.

Just cast it.
When a foo specific method is needed, generally you know that the
Base* is actually a ConcreteFoo. So just ensure the definition of class
ConcreteFoo is accessible and:
ConcreteFoo* foo2 = dynamic_cast<ConcreteFoo*>(foo);
One of the reasons I don't like this solution is dynamic_casts are slow and
require RTTI.
The next step from this is to avoid dynamic_cast.
ConcreteFoo* foo_cast( Base* d )
{
if( d->id() == the_foo_id )
{
return static_cast<ConcreteFoo*>(d);
}
throw std::runtime_error("you're screwed");
}
This requires one more method in the Base class which is completely
acceptable, but it requires the id's be managed. That gets difficult
when users can register their own Builders with the dynamic factory.
I'm not too fond of any of the casting solutions, as they require the
user classes to be defined where the specialized methods are used.
But maybe I'm just being a scope nazi.

The cstdarg solution.
Bjarne Stroustrup said:
A well defined program needs at most few functions for which the
argument types are not completely specified. Overloaded functions and
functions using default arguments can be used to take care of type
checking in most cases when one would otherwise consider leaving
argument types unspecified. Only when both the number of arguments and
the type of arguments vary is the ellipsis necessary
class Base
{
...
/// \return true if 'kind' supported
virtual bool concrete_specific( int kind, ... ) =0;
};
The disadvantages here are:
almost no one knows how to use cstdarg correctly
it doesn't feel very c++-y
it's not typesafe.

Could you create other non-concrete subclasses of Base and then use multiple factory methods in DynamicFactory?
Your goal seems to be to subvert the point of subclassing. I'm really curious to know what you're doing that requires this approach.

If the concrete object has a class-specific method, then it implies that you'd only be calling that method when you're dealing with an instance of that class, and not when you're dealing with the generic base class. Is this coming about because you're running a switch statement that checks the object type?
I'd approach this from a different angle, using the "unacceptable" first solution but with no parameters, with the concrete objects having member variables that store their state. Though I guess this would force you to have a member associative array as part of the base class, to avoid casting to set the state in the first place.
You might also want to try out the Decorator pattern.

You could do something akin to the CrazyMetaType or the cstdarg argument but simple and C++-ish. (Maybe this could be SaneMetaType.) Just define a base class for arguments to concrete_specific, and make people derive specific argument types from that. Something like
class ConcreteSpecificArgumentBase;
class Base
{
...
virtual void concrete_specific( ConcreteSpecificArgumentBase &argument ) =0;
};
Of course, you're going to need RTTI to sort things out inside each version of concrete_specific. But if ConcreteSpecificArgumentBase is well-designed, at least it will make calling concrete_specific fairly straightforward.
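A minimal sketch of how this could look. FooArguments, its fields, the last_label member and the bool return value are assumptions made for illustration (the declaration above returns void); only ConcreteSpecificArgumentBase and concrete_specific come from the answer.

```cpp
#include <string>

// Base class for all argument bundles (name taken from the answer above).
struct ConcreteSpecificArgumentBase {
    virtual ~ConcreteSpecificArgumentBase() {}
};

// Hypothetical argument type for ConcreteFoo's specific method.
struct FooArguments : ConcreteSpecificArgumentBase {
    std::string label;
    float scale;
    FooArguments(const std::string& l, float s) : label(l), scale(s) {}
};

struct Base {
    virtual ~Base() {}
    // Returns true if the argument type is supported (a choice made for this sketch).
    virtual bool concrete_specific(ConcreteSpecificArgumentBase& argument) = 0;
};

struct ConcreteFoo : Base {
    std::string last_label;  // records the last call, so the effect is observable
    bool concrete_specific(ConcreteSpecificArgumentBase& argument) override {
        // RTTI checks whether the caller passed the argument type we understand.
        FooArguments* args = dynamic_cast<FooArguments*>(&argument);
        if (!args)
            return false;
        foo_specific_method(args->label, args->scale);
        return true;
    }
private:
    void foo_specific_method(const std::string& label, float /*scale*/) {
        last_label = label;
    }
};
```

Callers only need the definition of FooArguments, not of ConcreteFoo, so the concrete class can stay hidden in its source file. Passing an argument type the object does not understand is reported through the return value instead of causing undefined behaviour.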

The weird part is that the users of your DynamicFactory receive a Base type, but need to do specific stuff when it is a ConcreteFoo.
Maybe a factory should not be used.
Try to look at other dependency injection mechanisms like creating the ConcreteFoo yourself, pass a ConcreteFoo type pointer to those who need it, and a Base type pointer to the others.

The context seems to assume that the user will be working with your ConcreteType and know it is doing so.
In that case, it seems that you could have another method in your factory that returns ConcreteType*, if clients know they're dealing with concrete type and need to work at that level of abstraction.
