I am making a toy programming language in C++, but I have run into a problem: a std::stack can only store one type of data. Is there an easy way around this, such as storing each object in the stack as a byte array? I would also like to know how the JVM overcomes this issue. The types I need to store on the stack are char, short, int, float, double, strings, arrays, and references to objects. I understand that the JVM stack might be more of an abstraction, but if it is, I would still like to know how they accomplished it. If it makes any difference, I am only planning to target Windows computers.
You know C++ has support for inheritance and polymorphism, right? A far easier way to do this is to derive all your tokens from a common base class, and make a stack of Base * objects, for instance:
#include <iostream>
#include <string>
#include <stack>
#include <memory>
class base {
public:
virtual void print_token() = 0;
virtual ~base() {}
};
class token_a : public base {
public:
token_a(int n) : n(n) {}
virtual void print_token() { std::cout << n << std::endl; }
private:
int n;
};
class token_b : public base {
public:
token_b(std::string s) : s(s) {}
virtual void print_token() { std::cout << s << std::endl; }
private:
std::string s;
};
int main(void) {
std::stack<std::shared_ptr<base> > my_stack;
my_stack.push(std::shared_ptr<base>(new token_a(5)));
my_stack.push(std::shared_ptr<base>(new token_b("a word")));
for ( int i = 0; i < 2; ++i ) {
std::shared_ptr<base> pb = my_stack.top();
pb->print_token();
my_stack.pop();
}
return 0;
}
outputs:
paul@local:~/src/cpp/scratch$ ./stack
a word
5
paul@local:~/src/cpp/scratch$
The way I solved this problem (in C, for a Lisp interpreter, about 25 years ago, but the same idea applies today) is to have a struct with a type tag and a union inside it:
struct Data // or class
{
    enum Kind { floatkind, intkind, stringkind, refkind };
    Kind kind;  // tag: says which union member is currently valid
    union
    {
        double f;
        int i;
        std::string* s; // pointer: a std::string member would need manual construction and destruction
        Data* r; // reference, can't use Data &r without heavy trickery.
    } u;
    Data(double d) { kind = floatkind; u.f = d; }
    Data(int i) { kind = intkind; u.i = i; }
    ...
};
std::stack<Data> st;
st.push(Data(42));
st.push(Data(3.14));
Just a guess, but the jvm probably treats everything as an object, so the stack is simply a collection of objects.
You can do the same, if you create a base data object class and derive all your supported data types from it.
I am trying to study static polymorphism and I implemented the following code. Thanks to the comments from StackOverflow members, I came to understand that what I wrote is not static polymorphism, but actually a template-based policy pattern.
Can anyone give any insight into how to turn this piece of code into static polymorphism?
#include <iostream>
template<typename T>
class Interface {
T ex;
public:
double getData() {
return ex.getData(0);
}
};
class Extractor1 {
public:
double getData(const int a) {
return 1;
}
};
class Extractor2 {
public:
double getData(const int a) {
return 2;
}
};
int main() {
// here is the problem: the following 2 variables belong to different types. Therefore, I cannot create an array of pointers which point to the base class
Interface<Extractor1> e1;
Interface<Extractor2> e2;
std::cout<<"FE1 "<< e1.getData() <<" FE2 "<< e2.getData()<<std::endl;
return 0;
}
You can change your code like this to achieve static polymorphism:
#include <iostream>
template <typename T>
class Interface {
public:
double getData(int a) {
return static_cast<T *>(this)->getData(a);
}
};
class Extractor1 : public Interface<Extractor1> {
public:
double getData(int a) {
return 1;
}
};
class Extractor2 : public Interface<Extractor2> {
public:
double getData(int a) {
return 2;
}
};
int main() {
// Create the derived objects: Interface<T>::getData casts this to T*, which is only valid if the object really is a T.
Extractor1 e1;
Extractor2 e2;
std::cout << e1.getData(1) << " " << e2.getData(2) << std::endl;
}
The advantage of using static polymorphism is that you avoid paying the runtime cost of a vtable lookup that you would pay when using virtual functions. The drawback, which I see you are running into based on your 'array' comment, is that you cannot place these different Extractor classes into an array or any other container, because each one inherits from a different base type (Interface<Extractor1> versus Interface<Extractor2>). The only way around this, aside from using something like a tuple or a container filled with boost::any objects, is creating a common base class for your Extractor classes.
Consider the following code snippet:
#include <iostream>
#include <memory>
#include <vector>
// Virtual destructor: items are deleted through unique_ptr<Base>.
struct Base { virtual ~Base() = default; virtual void func() { } };
struct Derived1 : Base { void func() override { std::cout << "1"; } };
struct Derived2 : Base { void func() override { std::cout << "2"; } };
class Manager {
std::vector<std::unique_ptr<Base>> items;
public:
template<class T> void add() { items.emplace_back(new T); }
void funcAll() { for(auto& i : items) i->func(); }
};
int main() {
Manager m;
m.add<Derived1>();
m.add<Derived2>();
m.funcAll(); // prints "1" and "2"
}
I'm using virtual dispatch in order to call the correct override method from a std::vector of polymorphic objects.
However, I know what type the polymorphic objects are, since I specify that in Manager::add<T>.
My idea was to avoid a virtual call by taking the address of the member function T::func() and directly storing it somewhere. However that's impossible, since I would need to store it as void* and cast it back in Manager::funcAll(), but I do not have type information at that moment.
My question is: it seems that in this situation I have more information than usual for polymorphism (the user specifies the derived type T in Manager::add<T>). Is there any way I can use this type information to prevent a seemingly unneeded virtual call? (A user should still be able to create their own classes that derive from Base in their code, however.)
However, I know what type the polymorphic objects are, since I specify that in Manager::add<T>.
No you don't. Within add you know the type of the object that's being added; but you can add objects of different types, as you do in your example. There's no way for funcAll to statically determine the types of the elements unless you parametrise Manager to only handle one type.
If you did know the type, then you could call the function non-virtually:
i->T::func();
But, to reiterate, you can't determine the type statically here.
If I understand correctly, you want your add method, which knows the class of the object, to store the right function in your vector depending on that class.
Your vector would then contain just functions, with no further information about the objects.
In effect you want to "resolve" the virtual call before it is invoked.
This may be worthwhile when the function is called many times, because you avoid the overhead of resolving the virtual call each time.
So you may want to use a process similar to what "virtual" does, with your own "virtual table".
The built-in virtual mechanism is implemented at a low level and is fast compared to whatever you will come up with, so again, the functions would have to be invoked a LOT of times before this becomes worthwhile.
One trick that can sometimes help in this kind of situation is to sort the vector by type (you should be able to use the knowledge of the type available in the add() function to enforce this) if the order of elements doesn't otherwise matter. If you are mostly going to be iterating over the vector in order calling a virtual function this will help the CPU's branch predictor predict the target of the call. Alternatively you can maintain separate vectors for each type in your manager and iterate over them in turn which has a similar effect.
Your compiler's optimizer can also help you with this kind of code, particularly if it supports Profile Guided Optimization (POGO). Compilers can de-virtualize calls in certain situations, or with POGO can do things in the generated assembly to help the CPU's branch predictor, like test for the most common types and perform a direct call for those with a fallback to an indirect call for the less common types.
Here are the results of a test program that illustrates the performance benefits of sorting by type. Manager is your version; Manager2 maintains a hash table of vectors indexed by typeid:
Derived1::count = 50043000, Derived2::count = 49957000
class Manager::funcAll took 714ms
Derived1::count = 50043000, Derived2::count = 49957000
class Manager2::funcAll took 274ms
Derived1::count = 50043000, Derived2::count = 49957000
class Manager2::funcAll took 273ms
Derived1::count = 50043000, Derived2::count = 49957000
class Manager::funcAll took 714ms
Test code:
#include <iostream>
#include <vector>
#include <memory>
#include <random>
#include <unordered_map>
#include <typeindex>
#include <chrono>
using namespace std;
using namespace std::chrono;
static const int instanceCount = 100000;
static const int funcAllIterations = 1000;
static const int numTypes = 2;
struct Base { virtual ~Base() = default; virtual void func() = 0; }; // virtual destructor: objects are deleted through unique_ptr<Base>
struct Derived1 : Base { static int count; void func() override { ++count; } };
int Derived1::count = 0;
struct Derived2 : Base { static int count; void func() override { ++count; } };
int Derived2::count = 0;
class Manager {
vector<unique_ptr<Base>> items;
public:
template<class T> void add() { items.emplace_back(new T); }
void funcAll() { for (auto& i : items) i->func(); }
};
class Manager2 {
unordered_map<type_index, vector<unique_ptr<Base>>> items;
public:
template<class T> void add() { items[type_index(typeid(T))].push_back(make_unique<T>()); }
void funcAll() {
for (const auto& type : items) {
for (auto& i : type.second) {
i->func();
}
}
}
};
template<typename Man>
void Test() {
mt19937 engine;
uniform_int_distribution<int> d(0, numTypes - 1);
Derived1::count = 0;
Derived2::count = 0;
Man man;
for (auto i = 0; i < instanceCount; ++i) {
switch (d(engine)) {
case 0: man.template add<Derived1>(); break; // "template" is required because man has a dependent type
case 1: man.template add<Derived2>(); break;
}
}
auto startTime = high_resolution_clock::now();
for (auto i = 0; i < funcAllIterations; ++i) {
man.funcAll();
}
auto endTime = high_resolution_clock::now();
cout << "Derived1::count = " << Derived1::count << ", Derived2::count = " << Derived2::count << "\n"
<< typeid(Man).name() << "::funcAll took " << duration_cast<milliseconds>(endTime - startTime).count() << "ms" << endl;
}
int main() {
Test<Manager>();
Test<Manager2>();
Test<Manager2>();
Test<Manager>();
}
In C++, the T q = dynamic_cast<T>(p); construction performs a runtime cast of a pointer p to some other pointer type T that must appear in the inheritance hierarchy of the dynamic type of *p in order to succeed. That is all fine and well.
However, it is also possible to perform dynamic_cast<void*>(p), which will simply return a pointer to the "most derived object" (see 5.2.7/7 in C++11). I understand that this feature probably comes for free in the implementation of the dynamic cast, but is it useful in practice? After all, its return type is at best void*, so what good is this?
The dynamic_cast<void*>() can indeed be used to check for identity, even if dealing with multiple inheritance.
Try this code:
#include <iostream>
class B {
public:
virtual ~B() {}
};
class D1 : public B {
};
class D2 : public B {
};
class DD : public D1, public D2 {
};
namespace {
bool eq(B* b1, B* b2) {
return b1 == b2;
}
bool eqdc(B* b1, B *b2) {
return dynamic_cast<void*>(b1) == dynamic_cast<void*>(b2);
}
};
int
main() {
DD *dd = new DD();
D1 *d1 = dynamic_cast<D1*>(dd);
D2 *d2 = dynamic_cast<D2*>(dd);
std::cout << "eq: " << eq(d1, d2) << ", eqdc: " << eqdc(d1, d2) << "\n";
return 0;
}
Output:
eq: 0, eqdc: 1
Bear in mind that C++ lets you do things the old C way.
Suppose I have some API in which I'm forced to smuggle an object pointer through the type void*, but where the callback it's eventually passed to will know its dynamic type:
struct BaseClass {
typedef void(*callback_type)(void*);
virtual callback_type get_callback(void) = 0;
virtual ~BaseClass() {}
};
struct ActualType: BaseClass {
callback_type get_callback(void) { return my_callback; }
static void my_callback(void *p) {
ActualType *self = static_cast<ActualType*>(p);
...
}
};
void register_callback(BaseClass *p) {
// service.register_listener(p->get_callback(), p); // WRONG!
service.register_listener(p->get_callback(), dynamic_cast<void*>(p));
}
The WRONG! code is wrong because it fails in the presence of multiple inheritance (and isn't guaranteed to work in the absence, either).
Of course, the API isn't very C++-style, and even the "right" code can go wrong if I inherit from ActualType. So I wouldn't claim that this is a brilliant use of dynamic_cast<void*>, but it's a use.
Casting pointers to void* has been important since the C days.
The most natural place for it is inside an operating system's memory manager, which has to keep track of every pointer and object you create. By storing them as void*, it can hold any object in its own data structure, which could be a heap, a B+ tree, or a simple array list.
For simplicity, take the example of a list of generic items (a list containing items of completely different classes). That would be possible only using void*.
The standard says that dynamic_cast should return null for an illegal pointer cast, and it also guarantees that any object pointer can be cast to void* and back; the only exception is function pointers.
In normal application-level code, casting to void* is rare, but it is used extensively in low-level and embedded systems.
Normally you would want to use reinterpret_cast for low-level work; on the 8086, for example, it is used to offset pointers from the same base to get an address, but it is not restricted to this.
Edit:
The standard says that you can convert any pointer to a polymorphic class to void*, even with dynamic_cast<>, but it nowhere states that you cannot convert the void* back to the object.
For most uses it is a one-way street, but some uses are unavoidable.
It just says that dynamic_cast<> needs type information to convert it back to the requested type.
There are many APIs that require you to pass an object as void*; for example, Java/JNI code passes the object as void*.
Without type information you cannot do the cast. If you are confident that the requested type is correct, you can ask the compiler to do the dynamic_cast<> with a trick.
Look at this code:
#include <cstdlib>
#include <exception>
#include <iostream>
using namespace std;
class Base_Class {public : virtual void dummy() { cout<<"Base\n";} };
class Derived_Class: public Base_Class { int a; public: void dummy() { cout<<"Derived\n";} };
class MostDerivedObject : public Derived_Class {int b; public: void dummy() { cout<<"Most\n";} };
class AnotherMostDerivedObject : public Derived_Class {int c; public: void dummy() { cout<<"AnotherMost\n";} };
int main () {
try {
Base_Class * ptr_a = new Derived_Class;
Base_Class * ptr_b = new MostDerivedObject;
Derived_Class * ptr_c,*ptr_d;
ptr_c = dynamic_cast< Derived_Class *>(ptr_a);
ptr_d = dynamic_cast< Derived_Class *>(ptr_b);
void* testDerived = dynamic_cast<void*>(ptr_c);
void* testMost = dynamic_cast<void*>(ptr_d);
Base_Class* tptrDerived = dynamic_cast<Derived_Class*>(static_cast<Base_Class*>(testDerived));
tptrDerived->dummy();
Base_Class* tptrMost = dynamic_cast<Derived_Class*>(static_cast<Base_Class*>(testMost));
tptrMost->dummy();
//tptrMost = dynamic_cast<AnotherMostDerivedObject*>(static_cast<Base_Class*>(testMost));
//tptrMost->dummy(); //fails
} catch (exception& my_ex) {cout << "Exception: " << my_ex.what();}
system("pause");
return 0;
}
Please correct me if this is not correct in any way.
It is useful when we return storage to a memory pool but only hold a pointer to the base class. In that case we need to figure out the original address.
Expanding on @BruceAdi's answer and inspired by this discussion, here's a polymorphic situation which may require pointer adjustment. Suppose we have this factory-type setup:
struct Base { virtual ~Base() = default; /* ... */ };
struct Derived : Base { /* ... */ };
template <typename ...Args>
Base * Factory(Args &&... args)
{
return ::new Derived(std::forward<Args>(args)...);
}
template <typename ...Args>
Base * InplaceFactory(void * location, Args &&... args)
{
return ::new (location) Derived(std::forward<Args>(args)...);
}
Now I could say:
Base * p = Factory();
But how would I clean this up manually? I need the actual memory address to call ::operator delete:
void * addr = dynamic_cast<void*>(p);
p->~Base(); // OK thanks to virtual destructor
// ::operator delete(p); // Error, wrong address!
::operator delete(addr); // OK
Or I could re-use the memory:
void * addr = dynamic_cast<void*>(p);
p->~Base();
p = InplaceFactory(addr, "some", "arguments");
delete p; // OK now
Don't do that at home
struct Base {
virtual ~Base ();
};
struct D : Base {};
Base *create () {
D *p = new D;
return p;
}
void *destroy1 (Base *b) {
void *p = dynamic_cast<void*> (b);
b->~Base ();
return p;
}
void destroy2 (void *p) {
operator delete (p);
}
int i = (destroy2 (destroy1 (create ())), i);
Warning: This will not work if D is defined as:
struct D : Base {
void* operator new (size_t);
void operator delete (void*);
};
and there is no way to make it work.
This might be one way to provide an Opaque Pointer through an ABI. Opaque Pointers -- and, more generally, Opaque Data Types -- are used to pass objects and other resources around between library code and client code in such a way that the client code can be isolated from the implementation details of the library. There are other ways to accomplish this, to be sure, and maybe some of them would be better for a particular use case.
Windows makes a lot of use of Opaque Pointers in its API. HANDLE is, I believe, generally an opaque pointer to the actual resource you have a HANDLE to, for example. HANDLEs can be Kernel Objects like files, GDI objects, and all sorts of User Objects of various kinds -- all of which must be vastly different in implementation, but all are returned as a HANDLE to the user.
#include <cstdlib>  // srand, rand
#include <ctime>    // time
#include <iostream>
#include <string>
#include <iomanip>
using namespace std;
/*** LIBRARY.H ***/
namespace lib
{
typedef void* MYHANDLE;
void ShowObject(MYHANDLE h);
MYHANDLE CreateObject();
void DestroyObject(MYHANDLE);
};
/*** CLIENT CODE ***/
int main()
{
for( int i = 0; i < 25; ++i )
{
cout << "[" << setw(2) << i << "] :";
lib::MYHANDLE h = lib::CreateObject();
lib::ShowObject(h);
lib::DestroyObject(h);
cout << "\n";
}
}
/*** LIBRARY.CPP ***/
namespace impl
{
class Base { public: virtual ~Base() { cout << "[~Base]"; } };
class Foo : public Base { public: virtual ~Foo() { cout << "[~Foo]"; } };
class Bar : public Base { public: virtual ~Bar() { cout << "[~Bar]"; } };
};
lib::MYHANDLE lib::CreateObject()
{
static bool init = false;
if( !init )
{
srand((unsigned)time(0));
init = true;
}
if( rand() % 2 )
return static_cast<impl::Base*>(new impl::Foo);
else
return static_cast<impl::Base*>(new impl::Bar);
}
void lib::DestroyObject(lib::MYHANDLE h)
{
delete static_cast<impl::Base*>(h);
}
void lib::ShowObject(lib::MYHANDLE h)
{
impl::Foo* foo = dynamic_cast<impl::Foo*>(static_cast<impl::Base*>(h));
impl::Bar* bar = dynamic_cast<impl::Bar*>(static_cast<impl::Base*>(h));
if( foo )
cout << "FOO";
if( bar )
cout << "BAR";
}
For a constructor with multiple arguments...
For example:
#include <iostream>
using namespace std;
class C {
public:
C(int a=1, int b=2){ cout << a << ", " << b << "\n"; }
};
int main(){
C a(10), b = 20;
}
output:
10, 2
20, 2
How do I assign a value to just the second parameter, so that I can get "1, 20" without knowing the default values? Or is it that I must always assign a value to the preceding argument before I can use the arguments after it?
And how do I implicitly assign all the parameters? If I can't do that, why not? For the above example (as I am new to C++), I expected to get "10, 20" as output instead.
Or is it that I must always assign a value to the preceding argument before I can use the arguments after it?
Yes. Otherwise, how is the compiler supposed to know which argument should be used for which parameter?
However, there are ways to accomplish this. For example,
struct C {
enum { DefaultA = 1, DefaultB = 2 };
C(int a = DefaultA, int b = DefaultB) { /* ... */ }
};
C object(C::DefaultA, 20);
Or, if you have a lot of parameters with different "defaults":
struct CParams {
int a, b;
CParams() : a(1), b(2) { }
};
struct C {
C(CParams x) { /* ... */ }
};
CParams params;
params.b = 20;
C object(params);
C++ doesn't support named arguments. You have to specify the first one.
Also, the variable name b from the main function is completely separate from the b in the constructor definition. There's no relationship whatsoever implied by the naming.
I had the same thought (Convienient C++ struct initialisation -- perhaps you find something you like better there) some time ago, but just now, reading your question, I thought of a way to actually accomplish this. It takes quite a lot of extra code, though, so the question remains whether it is actually worth it. I implemented it only as a rough sketch and I am not proud of my choice of names (I usually don't use _ but it's late). Anyway, this is how you can do it:
#include <iostream>
struct C_members {
int a;
int b;
C_members(int _a, int _b) : a(_a), b(_b) {}
};
class C_init {
public:
virtual C_members get(C_members init) const {
return init;
}
};
class C_a : public C_init {
private:
int a;
public:
C_a(int _a) : a(_a) {}
C_members get(C_members init) const {
init.a = a;
return init;
}
};
class C_b : public C_init {
private:
int b;
public:
C_b(int _b) : b(_b) {}
C_members get(C_members init) const {
init.b = b;
return init;
}
};
class C : private C_members {
private:
static const C_members def;
public:
C(C_init const& ai = C_init(), C_init const& bi = C_init()) : C_members(ai.get(bi.get(def)).a, bi.get(ai.get(def)).b) {
std::cout << a << "," << b << std::endl;
}
};
const C_members C::def(1,2); // default values
// usage:
int main() {
C c1(C_b(77)); // 1,77
C c2(C_a(12)); // 12,2
C c3(C_b(5),C_a(6)); // 6,5
return 0;
}
There is a lot of stuff that can be improved (with templates (for code reduction) and with const refs in the get method), but you get the idea.
As a bonus feature, you almost have the pimpl idiom implemented (very little effort is necessary to extend this to an actual pimpl design).
Usually in OOP, every object instance holds (and represents) a state.
So the best way is to define accessor functions such as
void setB(int newBvalue);
and to hold b as a private member.
If "b" is shared among all instances of the same class, consider making it a static member variable.
I need to bind a method into a function-callback, except this snippet is not legal as discussed in demote-boostfunction-to-a-plain-function-pointer.
What's the simplest way to get this behavior?
struct C {
void m(int x) {
(void) x;
_asm int 3;
}};
typedef void (*cb_t)(int);
int main() {
C c;
boost::function<void (int x)> cb = boost::bind(&C::m, &c, _1);
cb_t raw_cb = *cb.target<cb_t>(); //null dereference
raw_cb(1);
return 0;
}
You can make your own class to do the same thing as the boost bind function. All the class has to do is accept the function type and a pointer to the object that contains the function. For example, this is a void return and void param delegate:
template<typename owner>
class VoidDelegate : public IDelegate
{
public:
VoidDelegate(void (owner::*aFunc)(void), owner* aOwner)
{
mFunction = aFunc;
mOwner = aOwner;
}
~VoidDelegate(void)
{}
void Invoke(void)
{
if(mFunction != 0)
{
(mOwner->*mFunction)();
}
}
private:
void (owner::*mFunction)(void);
owner* mOwner;
};
Usage:
class C
{
public: // CallMe must be accessible for &C::CallMe to compile in main
void CallMe(void)
{
std::cout << "called";
}
};
int main(int aArgc, char** aArgv)
{
C c;
VoidDelegate<C> delegate(&C::CallMe, &c);
delegate.Invoke();
}
Now, since VoidDelegate<C> is a type, having a collection of these might not be practical, because what if the list was to contain functions of class B too? It couldn't.
This is where polymorphism comes into play. You can create an interface IDelegate, which has a function Invoke:
class IDelegate
{
public:
virtual ~IDelegate(void) { }
virtual void Invoke(void) = 0;
};
If VoidDelegate<T> implements IDelegate you could have a collection of IDelegates and therefore have callbacks to methods in different class types.
Either you can shove that bound parameter into a global variable and create a static function that can pick up the value and call the function on it, or you're going to have to generate per-instance functions on the fly - this will involve some kind of on the fly code-gen to generate a stub function on the heap that has a static local variable set to the value you want, and then calls the function on it.
The first way is simple and easy to understand, but not at all thread-safe or reentrant. The second version is messy and difficult, but thread-safe and reentrant if done right.
Edit: I just found out that ATL uses the code generation technique to do exactly this - they generate thunks on the fly that set up the this pointer and other data and then jump to the call back function. Here's a CodeProject article that explains how that works and might give you an idea of how to do it yourself. Particularly look at the last sample (Program 77).
Note that DEP has come into existence since the article was written, so you'll need to use VirtualAlloc with PAGE_EXECUTE_READWRITE to get a chunk of memory in which you can allocate your thunks and execute them.
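#include <cstdint>
#include <iostream>
// The integer argument carries the object pointer, so it must be pointer-sized.
// If the API fixes the parameter as int, this only works where int and pointers have the same size (e.g. 32-bit builds).
typedef void(*callback_t)(std::intptr_t);
template< typename Class, void (Class::*Method_Pointer)(void) >
void wrapper( std::intptr_t class_pointer )
{
Class * const self = reinterpret_cast<Class*>(class_pointer);
(self->*Method_Pointer)();
}
class A
{
public:
int m_i;
void callback( )
{ std::cout << "callback: " << m_i << std::endl; }
};
int main()
{
A a = { 10 };
callback_t cb = &wrapper<A,&A::callback>;
cb( reinterpret_cast<std::intptr_t>(&a) );
}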
I have it working right now by turning C into a singleton, factoring C::m into C::m_Impl, and declaring a static C::m(int) which forwards to the singleton instance. Talk about a hack.