Does GC remove all objects after ns removal in Clojure? - clojure

I am developing an application that gets data from a client and creates a new ns for it.
The app then does some manipulation in that ns, calling functions etc.
Finally the app returns some output and I remove the ns afterwards (remove-ns).
Does GC remove all data (objects) in that ns?
Another question: is it wise to create an ns for each client?
I need to isolate clients from each other so there will be no conflicts (concurrent users).

Unfortunately there isn't a simple yes or no answer to your question.
What remove-ns[1] does is call the static remove method in clojure.lang.Namespace[2], which atomically removes the symbol naming the namespace from the global mapping of ns names to Namespace objects. This makes it impossible to take a new reference to the namespace, because namespaces and fully qualified symbols/vars are resolved through that mapping, but it does NOT destroy the namespace or its contents.
If no references to the unmapped Namespace or its contents exist, then yes, it will (eventually) be garbage collected. However, this assumes that none of the Vars in that namespace ever escape. If you ever require/refer Vars from one of these temporary namespaces into a long-lived namespace, you're creating a permanent link between the two, which will result in the "temporary" namespace never being collected unless those Vars are also removed with ns-unmap [3].
Modern JVMs use tracing garbage collectors, so an object is only collected once a GC run actually happens (typically triggered by memory pressure) and nothing that is itself still live refers to it. So for instance if you were compiling a function per-session in its own sandbox namespace, returning that function, calling it and then throwing it away and unmapping the temporary namespace, that would probably work OK, because the only explicit references to the temporary namespace are the returned closure (which is discarded) and the global namespace mapping, which you remove. But the exact behavior depends entirely on the structure of your application.
At an architectural level, it's safe to say you're doing it wrong. Compilation in Clojure (eval) is slow. You really shouldn't be dynamically generating functions let alone namespaces at runtime. By doing so, you're creating this architectural problem for yourself because you're (ab)using what's intended to be a global binding structure (Namespaces and Vars) to hold temporary bindings which you then have to worry about cleaning up.
If you really needed dynamic bindings or stack local Vars, there are structures for creating such short-lived context. Otherwise you'd probably simplify your application greatly by refactoring it to make more extensive use of partial application and parametric context, which fall into more normal data use patterns and which will garbage collect normally.

I assume you're calling (remove-ns) on your namespace. If you look at clojure.core in the Clojure source code, it calls Namespace.remove on the namespace symbol, which in turn removes it from a concurrent hash map. If the objects are referenced only through the namespace, they should then be cleaned up by the GC.
Have a look at the Clojure source for the version you're using to be sure, and of course, if you're in doubt, test it yourself by attaching a profiler to see whether those objects are in fact removed after this function call.

Related

Is it OK to use global-scope objects (structs) in C++ (C)?

As we all know, it is a bad habit to use global variables unnecessarily and there is a tendency to keep the scope as small as possible.
But what about objects? Or structure instantiations of a similar function in C. Is there anything wrong with using global objects across multiple source files?
Thanks for shedding some light on this issue as I just got a bit... mangled.
summary
Global scalar type variables are a tool and tools are not good or bad, their use is appropriate or inappropriate. Adding the power of classes to that tool is not good or bad, it has the potential to improve the appropriateness or deteriorate the appropriateness.
Objects
An answer requires a definition of object. Object is not the complement of variable. The complement of variable is constant. I will use "scalar types" as the complement of object.
The problems of using global scalar type variables
It seems that the problems of global variables are in general something along these lines:
non-locality
no constraint checking
coupling
concurrency problems
namespace pollution
testing and debugging
I think that, as long as we talk about scalar types, memory footprint is an argument that can be considered obsolete, at least for PCs; embedded might be a different thing.
One remark: constants do not have all of these problems; that's why global constants are far more common than global variables.
The problems of using global objects
You have to ask yourself whether any of these problems is eliminated by using objects instead of scalar type variables.
All of those problems apply to objects as well.
Plus they add all the complexities of classes on top and make every single one of those objections worse, at least in general.
With objects you simply have more places where things can break, and more places where side effects hinder debugging. Objects also tend to be more complex, which adds potential problems with consistent initialization.
With objects you suddenly not only have to manage whether something is accessed at all, but you also have to make sure that every single point of access is compatible with the current interface of your object and the constraints of the underlying data type, and that all parts of the program manage ownership correctly.
I think it is not a matter of opinion: If you consider global scalar type variables to be bad you will have to consider global objects to be bad a fortiori.
cum grano salis
Of course there are rightful and perfectly justified applications of global scalar type variables as well as of global objects.
And if several global variables are logically connected, I think bundling those connections in a global object might tackle some of the objections. E.g. namespace pollution is of course reduced when you move some global scalar type variables into an object. And if there is an invariant among these variables, the class of a global object is the first place you would look for it, and a class would definitely be the best place to code that invariant. Setters might be a tool to soothe the problem of missing constraints.
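A minimal sketch of that last point, assuming two hypothetical globals (a lower and an upper bound) that must satisfy lo <= hi. The class name Range and the global g_range are made up for illustration; the point is only that the setters are the single place where the invariant is checked.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical example: two logically connected values that might
// otherwise be separate global scalars are bundled into one class.
class Range {
public:
    Range(int lo, int hi) : lo_(lo), hi_(hi) {
        if (lo > hi) throw std::invalid_argument("lo must not exceed hi");
    }
    int lo() const { return lo_; }
    int hi() const { return hi_; }

    // Setters reject updates that would break the invariant lo <= hi.
    void set_lo(int v) {
        if (v > hi_) throw std::invalid_argument("lo must not exceed hi");
        lo_ = v;
    }
    void set_hi(int v) {
        if (v < lo_) throw std::invalid_argument("hi must not fall below lo");
        hi_ = v;
    }

private:
    int lo_;
    int hi_;
};

// One global object instead of two global scalars.
Range g_range(0, 10);
```

This does not remove the other objections (non-locality, concurrency, testing); it only narrows the namespace footprint and gives the invariant a home.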

Should I pass my service locator to all my objects when they need them?

I recently decided that a service locator would be an okay design pattern to access important managers for my game like a world manager (can spawn entities and keeps track of them) and a sound manager. However, I am not sure of the most appropriate way to have access to the service locator. I started by passing a pointer to the service locator I instantiated in main, but this is becoming tedious, as I have found everything (projectiles, players, everything!) needs it in its arguments.
I am asking this here because I don't think that it is specific to games, but if I am wrong, just let me know.
Am I going about this pattern the wrong way?
Thanks for your time.
EDIT: Would a singleton solve this problem? They have a global point of access, but I don't think that it is the cleanest solution. Any ideas? Or would that be best?
This is a situation where using a global variable (or a Singleton) may be appropriate. You have to weigh the disadvantages of using a global variable / a singleton against the convenience of not having to pass a reference nearly everywhere. If you feel the disadvantages are unlikely to affect your design/code, then using a global variable can make your code much cleaner.
Some disadvantages of using a global variable are:
Managing the lifetime of your global can be more difficult. If you define multiple global variables in different translation units (cpp files), the order in which they are instantiated is unspecified, so they'd better not rely on each other during instantiation. One solution would be to store global pointers and instantiate the objects somewhere early in your program (e.g. in main), but then you have to make sure you don't create dangling pointers during the destruction phase of your program.
You may need to synchronize access to your global variable(s) in a multi-threaded context. With global variables it's harder to use per-thread objects (or proxies) to prevent having to synchronize access.
Additional disadvantages of a singleton can be:
Not being able to create copies of your class, e.g. for saving or undo.
Personally, I am in favour of using a single context object everywhere instead of using singletons. Have the context object provide you with functions to give you pointers/references or access to all the different services, managers, etc.
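As a rough sketch of such a context object: the classes below (WorldManager, SoundManager, Context, Player) are hypothetical stand-ins for the managers mentioned in the question, not taken from any real engine. The context owns nothing; it only hands out references, and the services themselves would be created once, e.g. in main.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical services standing in for the question's managers.
class SoundManager {
public:
    void play(const std::string& name) { played_.push_back(name); }
    const std::vector<std::string>& played() const { return played_; }
private:
    std::vector<std::string> played_;
};

class WorldManager {
public:
    int spawn() { return next_id_++; }  // returns a fresh entity id
private:
    int next_id_ = 0;
};

// The context owns nothing; it only hands out references to services
// that were created elsewhere and outlive it.
class Context {
public:
    Context(WorldManager& world, SoundManager& sound)
        : world_(world), sound_(sound) {}
    WorldManager& world() const { return world_; }
    SoundManager& sound() const { return sound_; }
private:
    WorldManager& world_;
    SoundManager& sound_;
};

// Game objects take one Context instead of one argument per service.
class Player {
public:
    explicit Player(const Context& ctx) : ctx_(ctx), id_(ctx.world().spawn()) {}
    void fire() { ctx_.sound().play("shot"); }
    int id() const { return id_; }
private:
    Context ctx_;
    int id_;
};
```

Compared with a singleton, this keeps the dependency visible in each constructor and lets a test pass in a context that points at fake services.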

Behavior of creating objects in ColdFusion

At one time I had a theory that instantiating objects on every request rather than having them reside in the Application scope was a huge memory hog. As my knowledge of ColdFusion has grown over the years, I don't think I really understood how CF deals with classes in the "black box" of the CF framework, so I'm going to ask this for community correction or confirmation.
I'm just going to throw out what I think is happening:
A CFC is compiled into a class, each method within that CFC is compiled into a class.
Those classes will reside in (PermGen) memory and can be written to disk based on CF administrator settings.
When a new object is created or template requested, the source code is hashed and compared to the hash stored with the compiled class.
If there is a match, it will use the compiled class in memory
If the compiled class doesn't exist, it will compile from source
If the compiled class exists, but the hash doesn't match, it will recompile.
As an aside, whenever you enable trusted cache, ColdFusion will no longer hash the source to check for differences and will continue to use the compiled class in memory.
Whenever you create a new object, you get a new pointer to the compiled class and its methods' classes and any runtime events occur in the pseudo-constructor. Edit: At this point, I'm referring to using createObject and having any "loose" code outside of functions run. When I say pointer, I mean the reference to memory allocated for the object's scopes (this, variables, function variables).
If you request an init, then the constructor runs. The memory consumed at this point is just your new reference and any variables set in the pseudo-constructor and constructor. You are not actually taking up memory for a copy of the entire class. Edit: For this step I'm referring to using the new operator or chaining your createObject().init() old school.
This eliminates a huge fallacy that I, personally, might have heard over the years that instantiating large objects in every request is a massive memory hog (due to having a copy of the class rather than just a reference). Please note that I am not in favor of this, the singleton pattern is amazing. I'm just trying to confirm what is going on under the hood to prevent chasing down red herrings in legacy code.
Edit: Thanks for the input everyone, this was a really helpful Q/A for me.
I've been developing CF for 14 years and I've never heard anyone claim that creating CFC instances on each request consumed memory due to class compilation. At the Java level, your CFML code is compiled directly to bytecode and stored as Java classes in memory and on disk. Java classes are not stored in the heap, but rather in the permanent generation, which is not (usually) a collected memory space. You can create as many instances of that CFC as you like and no more perm gen space will be used; however, heap space will be allocated to store the instance data for that CFC for the duration of its existence. Note, open source Railo does not use separate classes for methods.
Now, if you create a very large number of CFC instances (or any variable, for that matter), that will create a lot of cruft in your heap's young generations. As long as hard references are not held after the request finishes, those objects will be cleared from the heap when the next minor garbage collection runs. This isn't necessarily a bad thing, but heap sizes and GC pauses should always be taken into account when performance tuning an application.
Now, there are reasons to persist CFC instances, either as a singleton pattern or for the duration of a session, request, etc. One reason is the overhead of actual object creation. This often involves disk I/O to check last modified times. Object creation has become significantly faster since the old days, but is still pretty far behind native Java if you're going to be creating thousands of instances. The other main reason is for your objects to maintain state over the life of the application/session/request, such as a shopping cart stored in session while the user shops.
And for completeness, I'll attempt to address your points categorically:
For Adobe CF yes, for Railo, methods are inner classes
Yes.
Actually, I don't believe there is any hashing involved. It's all based on the datetime last modified on the source file.
Yes, but again, no hashing-- it just skips the disk I/O to check the last modified datetime
I don't think "pointer" is the right term as that implies the Java classes actually live in the heap. CF uses a custom URL classloader to load the class for the template and then an INSTANCE of that class is created and stored in the heap. I can understand how this may be confusing as CFML has no concept of "class". Everything is simply an instance or doesn't exist at all. I'm not sure what you mean by "runtime events occur[ing] in the pseudo-constructor".
To be clear, the JAVA constructor already ran the instant you created the CFC. The CF constructor may be optional, but it has zero bearing on the memory consumed by the CFC instance. Again, I think you're getting unnecessarily hung up on the pseudo-constructor as well. That's just loose code inside the component that runs when it is created and has no bearing on memory allocated in the heap. The Java class is never copied, it is just the template for the instance.

Global object and creation order

I'm still learning C++. I have one problem. Let's say that your project has a global object which always exists, e.g. ApiManager, and all other modules have access to it (by #include). For now I'm doing it like this:
Header:
class ApiManager : public QObject
{
    Q_OBJECT
public:
    explicit ApiManager(QObject *parent = 0);
signals:
public slots:
};
extern ApiManager apiMng;
Source:
ApiManager apiMng;
The problem is that other objects need to have access to it while they are being initialized too, and I noticed that C++ global objects seem to be created alphabetically. I'm wondering how you deal with this. Is there some trick for it? For example, in the Free Pascal world each class module has initialization and finalization sections:
Type
  TApiManager = class
  end;
var ApiMng: TApiManager;
initialization
  ApiMng := TApiManager.Create;
finalization
  ApiMng.Free;
... and the initialization order of project modules can be sorted in the project source in the uses clause (like #include in C++). I know that there are a lot of ways to do this (for example, initializing everything in main.cpp in a custom order), but I want to know what is considered a "good habit" in the C++ world.
Edit: Solved with Q_GLOBAL_STATIC (introduced in Qt 5.1, but it works in Qt 4.8 too), but I still have two issues:
I still don't know how to manage constructor order (and where to initialize), because global objects created by Q_GLOBAL_STATIC are not created at application startup. They are created on first use. So I need to "touch" these objects somewhere (in main.cpp?) in my custom order.
The documentation says that Q_GLOBAL_STATIC must be used in the body of a .cpp file, not in a header. But then other classes do not see this object. So I created a static function which exposes this object:
.cpp:
Q_GLOBAL_STATIC(ApiManager, apiMng)
ApiManager *ApiManager::instance()
{
    return apiMng();
}
But according to this topic: http://qt-project.org/forums/viewthread/13977 Q_GLOBAL_STATIC should expose the instance automatically, and it doesn't.
They are not initialized in alphabetical order. The initialization order among translation units is unspecified; the standard guarantees nothing about it.
Why global variables are evil
Global variables should be avoided for several reasons, but the primary reason is because they increase your program’s complexity immensely. For example, say you were examining a program and you wanted to know what a variable named g_nValue was used for. Because g_nValue is a global, and globals can be used anywhere in the entire program, you’d have to examine every single line of every single file! In a computer program with hundreds of files and millions of lines of code, you can imagine how long this would take!
Second, global variables are dangerous because their values can be changed by any function that is called, and there is no easy way for the programmer to know that this will happen.
Why Global Variables Should Be Avoided When Unnecessary
Non-locality -- Source code is easiest to understand when the scope of its individual elements are limited. Global variables can be read or modified by any part of the program, making it difficult to remember or reason about every possible use.
No Access Control or Constraint Checking -- A global variable can be get or set by any part of the program, and any rules regarding its use can be easily broken or forgotten. (In other words, get/set accessors are generally preferable over direct data access, and this is even more so for global data.) By extension, the lack of access control greatly hinders achieving security in situations where you may wish to run untrusted code (such as working with 3rd party plugins).
Implicit coupling -- A program with many global variables often has tight couplings between some of those variables, and couplings between variables and functions. Grouping coupled items into cohesive units usually leads to better programs.
Concurrency issues -- if globals can be accessed by multiple threads of execution, synchronization is necessary (and too-often neglected). When dynamically linking modules with globals, the composed system might not be thread-safe even if the two independent modules tested in dozens of different contexts were safe.
Namespace pollution -- Global names are available everywhere. You may unknowingly end up using a global when you think you are using a local (by misspelling or forgetting to declare the local) or vice versa. Also, if you ever have to link together modules that have the same global variable names, if you are lucky, you will get linking errors. If you are unlucky, the linker will simply treat all uses of the same name as the same object.
Memory allocation issues -- Some environments have memory allocation schemes that make allocation of globals tricky. This is especially true in languages where "constructors" have side-effects other than allocation (because, in that case, you can express unsafe situations where two globals mutually depend on one another). Also, when dynamically linking modules, it can be unclear whether different libraries have their own instances of globals or whether the globals are shared.
Testing and Confinement - source that utilizes globals is somewhat more difficult to test because one cannot readily set up a 'clean' environment between runs. More generally, source that utilizes global services of any sort (e.g. reading and writing files or databases) that aren't explicitly provided to that source is difficult to test for the same reason. For communicating systems, the ability to test system invariants may require running more than one 'copy' of a system simultaneously, which is greatly hindered by any use of shared services - including global memory - that are not provided for sharing as part of the test.
In general, please avoid global variables as a rule of thumb. If you do need to have them, please use Q_GLOBAL_STATIC.
Creates a global and static object of type QGlobalStatic, of name VariableName and that behaves as a pointer to Type. The object created by Q_GLOBAL_STATIC initializes itself on the first use, which means that it will not increase the application or the library's load time. Additionally, the object is initialized in a thread-safe manner on all platforms.
You can also use Q_GLOBAL_STATIC_WITH_ARGS. Here you can find some inline highlight from the documentation:
Creates a global and static object of type QGlobalStatic, of name VariableName, initialized by the arguments Arguments and that behaves as a pointer to Type. The object created by Q_GLOBAL_STATIC_WITH_ARGS initializes itself on the first use, which means that it will not increase the application or the library's load time. Additionally, the object is initialized in a thread-safe manner on all platforms.
Some people also tend to create a function for wrapping them, but they do not reduce the complexity significantly, and they eventually either forget to make those functions thread-safe, or they put more complexity in. Forget about doing that as well when you can.
The initialization order of global objects is only defined within a translation unit (there it is top to bottom). There is no guarantee between translation units. The typical work-around is to wrap the object into a function and return a reference to a local object:
ApiManager& apiMng() {
    static ApiManager rc;
    return rc;
}
The local object is initialized the first time the function is called (and, when using C++11 also in a thread-safe fashion). This way, the order of construction of globally accessed objects can be ordered in a useful way.
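To illustrate how this orders construction, here is a small sketch with two function-local statics where one depends on the other. Settings, settings() and construction_log() are hypothetical names added for the example, and this ApiManager is a plain class rather than the QObject from the question.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records construction order so the effect can be observed.
std::vector<std::string>& construction_log() {
    static std::vector<std::string> log;
    return log;
}

class Settings {
public:
    Settings() { construction_log().push_back("Settings"); }
    int timeout() const { return 30; }
};

Settings& settings() {
    static Settings instance;  // constructed on the first call only
    return instance;
}

class ApiManager {
public:
    // Depends on Settings: calling settings() here forces Settings to be
    // fully constructed before this constructor continues.
    ApiManager() : timeout_(settings().timeout()) {
        construction_log().push_back("ApiManager");
    }
    int timeout() const { return timeout_; }
private:
    int timeout_;
};

ApiManager& apiMng() {
    static ApiManager instance;  // constructed on the first call only
    return instance;
}
```

The first call to apiMng() constructs Settings and then ApiManager, regardless of which translation unit each accessor lives in. Destruction runs in reverse order of completed construction, so anything that uses these objects from another static's destructor still needs care.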
That said, don't use global objects. They are causing more harm than good.
A good habit in the C++ world would be to avoid global objects at all costs: the more localized an object is, the better.
If you absolutely have to have a global object, I think the best approach would be to initialize objects in a custom order in main, to be explicit about initialization order. The fact that you are using Qt is one more argument for initializing in main: you would probably want to initialize QApplication (which requires argc and argv as input arguments) prior to any other QObject.
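One way to make that order explicit, sketched here without Qt so it stays self-contained: group the long-lived objects in a struct and rely on the rule that members are constructed in declaration order. Logger, Services and run_application() are hypothetical names, and this ApiManager is a plain class, not the QObject from the question.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records construction order so the example can be checked.
std::vector<std::string>& init_log() {
    static std::vector<std::string> log;
    return log;
}

class Logger {
public:
    Logger() { init_log().push_back("Logger"); }
};

class ApiManager {
public:
    explicit ApiManager(Logger& logger) : logger_(logger) {
        init_log().push_back("ApiManager");
    }
    Logger& logger() const { return logger_; }
private:
    Logger& logger_;
};

// Members are constructed in declaration order, so dependencies are
// listed first and the order is visible in one place.
struct Services {
    Logger logger;
    ApiManager api{logger};
};

// Intended to be called once near the top of main(); in a Qt program the
// QApplication would be created before this point.
int run_application() {
    Services services;  // everything is constructed here, in a known order
    return static_cast<int>(init_log().size());
}
```

Objects that need a service then receive a reference to it (or to the whole Services struct) instead of reaching for a global.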

When are global variables actually considered good/recommended practice?

I've been reading a lot about why global variables are bad and why they should not be used. And yet most of the commonly used programming languages support globals in some way.
So my question is what is the reason global variables are still needed, do they offer some unique and irreplaceable advantage that cannot be implemented alternatively? Are there any benefits to global addressing compared to user specified custom indirection to retrieve an object out of its local scope?
As far as I understand, in modern programming languages, global addressing comes with the same performance penalty as calculating every offset from a memory address, whether it is an offset from the beginning of the "global" user memory or an offset from a this or any other pointer. So in terms of performance, the user can fake globals in the narrow cases they are needed using common pointer indirection without losing performance to real global variables. So what else? Are global variables really needed?
Global variables aren't generally bad because of their performance, they're bad because in significantly sized programs, they make it hard to encapsulate everything - there's information "leakage" which can often make it very difficult to figure out what's going on.
Basically the scope of your variables should be only what's required for your code to both work and be relatively easy to understand, and no more. Having global variables in a program which prints out the twelve-times tables is manageable, having them in a multi-million line accounting program is not so good.
I think this is another subject similar to goto - it's a "religious thing".
There are a lot of ways to "work around" globals, but if you are still accessing the same bit of memory in various places in the code you may have a problem.
Global variables are useful for some things, but should definitely be used "with care" (more so than goto, because the scope of misuse is greater).
There are two things that make global variables a problem:
1. It's hard to understand what is being done to the variable.
2. In a multithreaded environment, if a global is written from one thread and read by any other thread, you need synchronisation of some sort.
But there are times when globals are very useful. Having a config variable that holds all your configuration values that came from the config file of the application, for example. The alternative is to store it in some object that gets passed from one function to another, and it's just extra work that doesn't give any benefit. In particular if the config variables are read-only.
As a whole, however, I would suggest avoiding globals.
Global variables imply global state. This makes it impossible to store overlapping state that is local to a given part or function in your program.
For example, let's say we store the credentials of a given user in global variables which are used throughout our program. It will now be a lot more difficult to upgrade our program to allow multiple users at the same time. Had we just passed a user's state as a parameter to our functions, we would have had far fewer problems upgrading to multiple users.
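A small sketch of the difference, with a hypothetical Credentials type and function names chosen only for illustration:

```cpp
#include <cassert>
#include <string>

// Hypothetical credentials type; the field names are illustrative only.
struct Credentials {
    std::string user;
    std::string token;
};

// Global-state version: only one user can be "current" at a time.
Credentials g_current_user;
std::string greet_global() { return "hello " + g_current_user.user; }

// Parameter version: the caller decides whose state is used, so two
// users can be handled side by side without touching shared state.
std::string greet(const Credentials& who) { return "hello " + who.user; }
```

With greet, serving a second user is just a second call with a different argument; with greet_global, every caller has to agree on who the single current user is.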
my question is what is the reason global variables are still needed,
Sometimes you need to access the same data from a lot of different functions. This is when you need globals.
For instance, I am working on a piece of code right now, that looks like this:
static runtime_thread *t0;

void
queue_thread (runtime_thread *newt)
{
    t0 = newt;
    do_something_else ();
}

void
kill_and_replace_thread (runtime_thread *newt)
{
    t0->status = dead;
    t0 = newt;
    t0->status = runnable;
    do_something_else ();
}
Note: Take the above as some sort of mixed C and pseudocode, to give you an idea of where a global is actually useful.
Static globals are almost mandatory when writing any cross-platform library. These global variables are static so that they stay within the translation unit. There are few if any cross-platform libraries that do not use static global variables, because they have to hide their platform-specific implementation from the user. These platform-specific implementations are held in static global variables. Of course, if they use an opaque pointer and require the platform-specific implementation to be held in such a structure, they could make a cross-platform library without any static global. However, such an object needs to be passed to all functions within such a library. Therefore, you either have to pass this opaque pointer everywhere, or use static global variables.
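The two options can be sketched side by side. This is only an illustration under assumed names (Timer, global_tick, etc. are hypothetical), written as a single file with comments marking what would normally be split between a public header and an implementation file.

```cpp
#include <cassert>
#include <memory>

// --- what a public header would expose ---
class Timer {
public:
    Timer();
    ~Timer();
    void tick();
    int ticks() const;
private:
    struct Impl;                  // defined only in the implementation file
    std::unique_ptr<Impl> impl_;  // opaque handle carried by the caller
};

// --- what the implementation file would contain ---
struct Timer::Impl {
    int count = 0;  // platform-specific state would live here
};
Timer::Timer() : impl_(new Impl) {}
Timer::~Timer() = default;  // Impl is a complete type at this point
void Timer::tick() { ++impl_->count; }
int Timer::ticks() const { return impl_->count; }

// Alternative: file-local (static) state hidden behind free functions.
// Callers pass nothing around, but there is exactly one shared state.
static int g_ticks = 0;
void global_tick() { ++g_ticks; }
int global_ticks() { return g_ticks; }
```

The opaque-handle version allows several independent instances at the cost of threading the handle through every call; the static version is more convenient for callers but fixes the library to a single shared state.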
There's also the identifier limit issue. Compilers (especially older ones) have a limit to the number of identifiers they could handle within a scope. Many operating systems still use tons of #define instead of enumerations because their old compilers cannot handle the enumeration constants that bloat their identifiers. A proper rewrite of the header files could solve some of these.
Global variables are considered when you want to use them in every function including main. Also remember that if you initialize a variable globally, its initial value will be same in every function, however you can reinitialize it inside a function to use a different value for that variable in that function. In this way you don't have to declare the same variable again and again in each function. But yes they can cause trouble at times.
Global names are available everywhere. You may unknowingly end up using a global when you think you are using a local
And if you make a mistake while declaring a global variable, for example accidentally declaring it as int instead of float, you'll have to apply the fix across the whole program.