Use of singletons in compiler - c++

I am writing a compiler for a C++-like language. I have to deal with symbol table which is represented in my code with class that has only static data and methods.
It's like:
class GlobalTable
{
/* static members */
static map<int, Symbol*> symbol_by_id;
static map<Symbol*, int> id_by_symbol;
/* some static methods */
};
Also is have a class which represents config:
class GlobalConfig
{
static const int int_size;
/* and so on ... */
};
I need to access these from a lot of places. Passing it around will result in swelling of code.
Is it convenient to use my class like that, or there is a better way to organize everything?

In a compiler, usually you store a symbol type in a symbol table, for some definition of symbol. Generally, I have a symbol table per scope, so the global scope in each module or compilable unit has its own symbol table.
You'll also have AST nodes that have symbols and symbol subtpes as their members.
Most internal compiler apis deal with symbol and ast subtypes, and those are stored in a indexed, fast access data structure. All of this has to be available to most compiler functions, so you either use a global compiler or context variable, or pass one around as a paramter.
Strictly, there is no need to use singletons. I can only talk in examples, so the way I do it is with a Compiler class that has instance members like currentScope and globalScope as well as the root AST or currentCompilableUnit, so the parser main instantiates a Compiler, and everything is reentrant. It is a bit more convenient, though, if your compiler instance is a global variable, so your functions can omit the compiler parameter, but besides that, not really any need for any others.
In short, it is typical for most APIs to include a "scope" or "compiler" struct or class in many of its signatures. Though if you use OOP to model the compiler, any methods of the class have implicit "this" access.

Related

A c++ class include a static member with the same type of itself. Why this pattern?

I inherited a project from a former colleague, and I found these code snippets (and some similar ones in SO questions: can a c++ class include itself as an member and static member object of a class in the same class)
// Service.h
class Service
{
// ...
public:
static Service sInstance;
void Somememberfunc();
//...
};
// Service.cpp
#include "Service.h"
Service Service::sInstance;
void Service::Somememberfunc()
{
//...
}
// Main.cpp
#include "Service.h"
void Fun()
{
Service &instance = Service::sInstance;
//...
instance.Somememberfunc();
//...
}
However, I did not find any explanation on when to use this pattern. And what are the advantages and disadvantages?
Notice that the member is a static, so it's part of the class, not of instantiated objects. This is important, because otherwise you would be trying to make a recursive member (since the member is part of the object, it also contains the same member and so on...), but this is not the case here.
The best pattern to describe this is: global variable. The static member is initialized before main() and can be accessed from any part of the program by including the header file. This is very convenient while implementing but becomes harder to handle the more complex the program gets and the longer you have to maintain it, so the general idea is to avoid this. Also, because there is no way to control the order of initialization, dependencies between different global variables can cause problems during startup.
Static member is roughly a global variable in the scope of the class.
Static members have also the advantage of visibility access (public/protected/private) to restreint its usage (file scope might be an alternative).
That member might be of type of the class.
Global are "easy" to (mis)use, as they don't require to think about architecture.
BUT (mutable) global are mostly discouraged as harder to reason about.
Acceptable usages IMO are for constants:
as for a matrix class, the null matrix, the diagonal one matrix.
for Temperature class, some specific value (absolute 0 (O Kelvin), temperature of water transformation(0 Celsius, 100 Celsius), ...)
in general NullObject, Default, ...
Singleton pattern. For example, you can use it to store your app configurations. And then you can easily access configurations anywhere(globally) within your app.
This is often used in the singleton design pattern
Visit https://en.wikipedia.org/wiki/Singleton_pattern

Calling member function without creating object in C++

Can anyone explain why we can call static member functions without creating instance of an object but we can't in case of non-static functions?
I searched everywhere, I couldn't find an explanation, Can you help?
You've got the logic basically in reverse. It is useful to have functions that belong to a class, even though they do not need to be called on an object of that class. Stroustrup didn't want to add a new keyword for that, so he repurposed the existing keyword static to distinguish such methods from normal methods.
In hindsight, other options could have been chosen. We could have made this an explicit function argument for the normal methods, for instance. But that's about 30 years too late now.
Why?
Because we like encapsulation and modularity which we get when using object orientated patterns.
You can view static class member functions and variables as a way of encapsulating what would otherwise have been global functions and variables - which might collide.
Underneath, there is not a great deal of difference between a plain old C++ file with some functions declared and implemented in it, to a class filled with only static functions. The difference is we can cleanly access a set of functions attached to a parent class.
An example:
Suppose you want to create a new class like MyMediumInteger, and you want developers to determine what the maximum number it can hold is. This information is applicable for every instance of MyMediumInteger, no matter what the state of the private member variables are. Therefore it makes sense to expose this information without forcing the developer to instantiate the class. Your options include defining something global such as #define MYMEDIUMINTEGER_MAX ... which could collide with a define sharing the same name in another module, or create a static function returning the max size, called neatly via
MyMediumInteger::maxSize().
A code example:
/***
* Use a static class member function (or variable)
*/
class MyMediumInteger
{
public:
static unsigned long maxSize() { return pow(2, 32) - 1; };
};
auto maxSize = MyMediumInteger::maxSize();
/**
* Alternative approaches.
* Note: These could all collide with other #defines or symbols declared/implemented in other modules.
* Note: These are both independant sources of information related to a class - wouldn't it be nicer if they could just belong to the class instead?
*/
/***
* Use a #define
*/
#define MYMEDIUMINTEGER_MAX (pow(2, 32) - 1)
auto maxSize = MYMEDIUMINTEGER_MAX;
/**
* Use a global function or variable
*/
static unsigned long getMyMediumIntegerMaxSize()
{
return pow(2, 32) - 1;
}
auto maxSize = getMyMediumIntegerMaxSize();
A word of warning:
Static member variables and functions have some of the pitfalls of global variables - because they persist through instances of classes, calling and assigning to them can cause unexpected side effects (because static member variables get changed for everyone, not just you).
A common pitfall is to add lots of static member functions and variables for management - for example, maintaining a list of all other class instances statically which gets appended to in the constructor of each class. Often this type of code could be refactored into a parent class whose job it is just to manage instances of the child class. The latter approach is far more testable and keeps things modular - for example, someone else could come along and write a different implementation of the management code without appending to or re-writing your child class implementation.
How?
In C++ class member functions are not actually stored in class instances, they are stored separately and called on instances of classes. Therefore, it's not hard to imagine how static functions fit into this - they are declared in the same way as normal class member functions, just with the static keyword to say "I don't need to be given a class instance to be run on". The consequence of this is it can't access class member variables or methods.
Static member functions does not need a "this" pointer, it belongs to the class.
Non-static functions need a "this" pointer to point out in which object it belong to, so you need to creat a object firstly.
Because that's what static means: "I don't want this function to require an object to work on".
That's the feature. That's its definition.
Any other answer would be circular.
Why is it useful? Sometimes we want to be able to "organise" some data or utilities in our classes, perhaps for use by external code or perhaps for use by non-static functions of the same class. In that sense there is some crossover with namespaces.

C++ Declaration of class variables in header or .cpp?

So far, I've been using classes the following way:
GameEngine.h declares the class as follows
class GameEngine {
public:
// Declaration of constructor and public methods
private:
InputManager inputManager;
int a, b, c;
// Declaration of private methods
};
My GameEngine.cpp files then just implement the methods
#include "____.h"
GameEngine::GameEngine() {
}
void GameEngine::run() {
// stuff
}
However, I've recently read that variable declarations are not supposed to be in the header file. In the above example, that would be an inputManager and a, b, c.
Now, I've been searching for where to put the variable declarations, the closest answer I found was this: Variable declaration in a header file
However, I'm not sure if the use of extern would make sense here; I just declare private variables that will only be used in an instance of the class itself. Are my variable declarations in the header files fine? Or should I put them elsewhere? If I should put them in the cpp file, do they go directly under the #include?
Don't confuse a type's members with variables. A class/struct definition is merely describing what constitutes a type, without actually declaring the existence of any variables, anything to be constructed on memory, anything addressable.
In the traditional sense, modern class design practices recommend you pretend they are "black boxes": stuff goes in, they can perform certain tasks, maybe output some other info. We do this with class methods all the time, briefly describing their signature on the .h/.hpp/.hxx file and hiding the implementation details in the .cpp/.cc/.cxx file.
While the same philosophy can be applied to members, the current state of C++, how translation units are compiled individually make this way harder to implement. There's certainly nothing "out of the box" that helps you here. The basic, fundamental problem is that for almost anything to use your class, it kind of needs to know the size in bytes, and this is something constrained by the member fields and the order of declaration. Even if they're private and nothing outside the scope of the type should be able to manipulate them, they still need to briefly know what they are.
If you actually want to hide this information to outsiders, certain idioms such as PImpl and inlined PImpl can help. But I'd recommend you don't go this way unless you're actually:
Writing a library with a semi-stable ABI, even if you make tons of changes.
Need to hide non-portable, platform-specific code.
Need to reduce pre-processor times due to an abundance of includes.
Need to reduce compile times directly impacted by this exposure of information.
What the guideline is actually talking about is to never declare global variables in headers. Any translation unit that takes advantage of your header, even if indirectly, will end up declaring its own global variable as per header instructions. Everything will compile just fine when examined individually, but the linker will complain that you have more than one definition for the same thing (which is a big no-no in C++)
If you need to reserve memory / construct something and bind it to a variable's name, always try to make that happen in the source file(s).
Class member variables must be declared in the class definition, which is usually in a header file. This should be done without any extern keywords, completely normally, like you have been doing so far.
Only variables that are not class members and that need to be declared in a header file should be declared extern.
As a general rule:
Variables that are to be used with many functions in the same class go in the class declaration.
Temporary variables for individual functions go in the functions themselves.
It seems that InputManager inputManager; belongs in the class header.
int a, b, c; is harder to know from here. What are they used for? They look like temporary variables that would be better off in the function(s) they're used in, but I can't say for sure without proper context.
extern has no use here.

Best practice for handling const class data

Say you have a certain class in which each instance of it needs to access the exact same set of data. It is more efficient to declare the data outside the class rather than have each instance make its own one, but doesn't this breach the 'no globals' rule?
Example code:
Foo.h
class Foo{
Foo();
void someFunction();
};
Foo.cpp
#include "Foo.h"
//surely ok since it's only global to the class's .cpp?
const int g_data[length] = {
//contains some data
};
Foo::someFunction(){
//do something involving g_data ..
}
..rather than making 'g_data' a member variable. Or is there some other way which avoids creating a global?
Use the modifier static which modifies the declaration of a class member so that is shared among all class instances. An example:
class A {
int length = 10;
static int g_data[length]; //Inside the class
}
And then you can access it like this:
int main(void) {
std::cout << "For example: " << A::g_data[2] << std::endl;
}
You can find more on this matter here
That's what a static member is for.
Thus you will have in your declaration :
class Foo{
Foo();
void someFunction();
static int const sharedData[length];
};
And somewhere in your cpp file :
int const Foo::sharedData[length] = { /* Your data */ };
Summarily, yes - your "global" is probably ok (though it'd be better in an anonymous namespace, and there are a few considerations).
Details: lots of answers recommending a static class member, and that is often a good way to go, but do keep in mind that:
a static class member must be listed in the class definition, so in your case it'll be in Foo.h, and any client code that includes that header is likely to want to recompile if you edit the static member in any way even if it's private and of no possible direct relevance to them (this is most important for classes in low-level libraries with diverse client code - enterprise libraries and those shared across the internet and used by large client apps)
with a static class member, code in the header has the option of using it (in which case a recompile will be necessary and appropriate if the static member changes)
if you put the data in the .cpp file, it's best in an anonymous namespace so it doesn't have external linkage (no other objects see it or can link to it), though you have no way to encapsulate it to protect it from access by other functions later in the .cpp's translation unit (whereas a non-public static class member has such protection)
What you really want is data belonging to the type, not to an instance of that type. Fortunately, in C++ there is an instrument doing exactly that — static class members.
If you want to avoid the global and have a more object-oriented solution, take a look at the Flyweight pattern. The Boost Flyweight library has some helper infrastructure and provides a decent explanation of the concept.
Since you are talking about efficiency, it may or not be a good idea to use such an approach, depending on your actual goal. Flyweights are more about saving memory and introduce an additional layer of indirection which may impact runtime performance. The externally stored state may impact compiler optimizations, especially inlining and reduce data locality which prevents efficient caching. On the other hand, some operations which like assignment or copying can be considerably faster because there is only the pointer to the shared state that needs to be copied (plus the non-shared state, but this has to be copied anyway). As always when it comes to efficiency / performance, make sure you measure your code to compare what suits your requirements.

Putting class static members definition into cpp file -- technical limitation?

one of my "favorite" annoyance when coding in C++ is declaring some static variable in my class and then looking at compilation error about unresolved static variable (in earlier times, I was always scared as hell what does it mean).
I mean classic example like:
Test.h
class Test
{
private:
static int m_staticVar;
int m_var;
}
Test.cpp
int Test::m_staticVar;
What makes it in my eyes even more confusing is the syntax of this definition, you can't use 'static' word here (as static has different meaning when used in cpp, sigh) so you have no idea (except the knowledge static member vars work like that) why on earth there's some int from Test class defined in this way and why m_var isn't.
To your knowledge / opinion, why is that? I can think of only one reason and that is making linker life easier -- i.e. for the same reason why you can't use non-integral constants (SomeClass m_var = something). But I don't like an idea of bending language features just because some part of compilation chain would have hard time eating it...
Well, this is just the way it works. You've only declared the static member in the .h file. The linker needs to be able to find exactly one definition of that static member in the object files it links together. You can't put the definition in the .h file, that would generate multiple definitions.
UPDATE: C++17 can solve this with an inline variable.
First, from compiler's point of view, this is perfectly reasonable. Why redundant keyword where it is not needed?
Second, I'd recommend against static members in C++. Before everybody jumps, I will try to explain.
Well, you are not going to have any public static data members (very rarely useful). In any case, most classes have their own CPP file. If so, a static global, IMO is preferable over a private static member for reasons of dependency reduction. Unlike non-static private data, the static ones are not part of the interface, and there's very little reason why should the user of the h file ever have to recompile, or at all see these members.