Are Hashes ever on the stack in Crystal lang? - crystal-lang

In the Crystal language, are Hashes ever allocated on the stack? Or are they always 'heaped'? I could not find anything in the docs (https://crystal-lang.org/api/0.33.0/Hash.html - looked up on 19 Feb 2020). I see quite a few malloc_* in https://github.com/crystal-lang/crystal/blob/master/src/hash.cr, but wasn't sure if there was optimization I was missing. I don't think the docs call it out explicitly - did a word-find in browser for 'heap', 'stack' and 'allocate' on https://crystal-lang.org/api/0.33.0/Hash.html... couldn't find anything.

Hashes are always heap allocated.
In Crystal, it depends on whether a object is of type Reference or Value. All objects that inherit from Reference are always allocated on the heap.
A hash is defined as class Hash(K, V). As class always inherits from Reference, hashes will always be heap allocated.
Value types like struct Int32, where stack allocation is desirable, have to be defined as structs, so they do not inherit from Reference but from Value.

Related

C++ vector at(#) replacement clean up memory?

The pains of compiled libraries is never knowing exactly how the memory is managed.
I'm lead to believe that vector's elements are placed on the heap unless explicity told not to.
Being placed on the heap, it obviously needs to be deleted when it is no longer used, which seems to happen when the vector object is deleted.
The question is when the at(#) or operator[] is called does it delete the memory being replaced?
For example:
std::vector<string> secretlyAnArray(5);
secretlyAnArray.at(0) = std::string("Does this memory leak?");
secretlyAnArray.at(0) = std::string("When I overwrite the object?")
Happy to learn better methods to replace data at a specific index of a vector, or just pointing at the documentation that explains it.
Edit 2:
After the helpful comments of Anis and Daniel which are much appreciated; it appears that at(#) returns a reference, which then standard reference rules apply rather than governed by the behavour of vector.
std::vector<std::string> secretlyAnArray(5);
secretlyAnArray.at(0) = std::string("Does this memory leak?");
secretlyAnArray.at(0) = std::string("When I overwrite the object?")
This code assigns an object of type std::string to another object of type std::string. The std::vector itself isn't at all involved in this assignment. Here, it just provides a reference to the destination (assigned-to) object.
What does assignment do for a given type depends on its definition. There is no generic answer. For instance, with std::string, when you copy/move-assign, the original content of the destination object (its string of characters) needs to be "destroyed". std::string does it for you, so, you don't need to care about it. It means that if the string is "long" (with regard to short string optimization; SSO), the memory will be correctly deallocated, if needed.

Scenarios where we force to use Pointers in c++

I had been in an interview and asked to give an example or scenario in CPP where we can't proceed without pointers, means we have to use pointer necessarily.
I have given an example of function returning array whose size is not known then we need to return pointer that is name of the array which is actually a pointer. But the interviewer said its internal to array give some other example.
So please help me with some other scenarios for the same.
If you are using a C Library which has a function that returns a pointer, you have to use pointers then, not a reference.
There are many other cases (explicitly dealing with memory, for instance) - but these two came to my mind first:
linked data-structures
How: You need to reference parts of your structure in multiple places. You use pointers for that, because containers (which also use pointers internally) do not cover all your data-structure needs. For example,
class BinTree {
BinTree *left, *right;
public:
// ...
};
Why there is no alternative: there are no generic tree implementations in the standard (not counting the sorting ones).
pointer-to-implementation pattern (pimpl)
How: Your public .hpp file has the methods, but only refers to internal state via an opaque Whatever *; and your internal implementation actually knows what that means and can access its fields. See:
Is the pImpl idiom really used in practice?
Why there is no alternative: if you provide your implementation in binary-only form, users of the header cannot access internals without decompiling/reverse engineering. It is a much stronger form of privacy.
Anyplace you would want to use a reference, but have to allow for null values
This is common in libraries where if you pass a non zero pointer, it will be set to the value
It is also a convention to have arguments to a function that will be changed to use a pointer, rather than a reference to emphasize that the value can be changed to the user.
Here are some cases:
Objects with large lifetime. You created some object in function. You need this object afterwards (not even copy of it).
But if you created it without pointers, on stack - after function would finish, this object would die. So you need to create this object using dynamic memory and return pointer to it.
Stack space is not enough. You need object which needs lot of memory, hence allocating it on the stack won't fit your needs, since stack has less space than heap usually. So you need to create the object again using dynamic memory on heap and return pointer to it.
You need reference semantics. You have structure which you passed to some function and you want the function to modify this structure, in this case you need to pass a pointer to this structure, otherwise you can't modify the original structure, since copy of it will be passed to the function if you don't use pointers.
Note: in the latter case, indeed using pointer is not necessary, since you can substitute it using reference.
PS. You can browse here for more scenarios, and decide in which cases are pointer usages necessary.
pointers are important for performance example of this are for functions. originally when you pass a value in a function it copies the value from the argument and stores to the parameter
but in pointers you can indirectly access them and do what you want

Find pointers from pointee

From this code:
int x = 5;
int other = 10;
vector<int*> v_ptr;
v_ptr.push_back(&x);
v_ptr.push_back(&other);
v_ptr.push_back(&x);
Is there anyway I can know who points at x, from the x variable itself, so that I don't have to search inside v_ptr for address of x? Is this possible in C++ or C++0x?
I read that when doing garbage collection, they look at memory and see if anything points at it or not, then make decisions to delete the unused variable, etc.
No. It is like asking a person if they know everyone who knows their address.
No, you can not know what has a reference to x without iterating through the possible places you assigned it(v_ptr)
--Or--
If you must do this, you may want to do some kinda reference tracking(which can be used for garbage collection) like
v_ptr[0]=add_reference(&x,&v_ptr[0]);
where add_reference is some function to have a list of references made to the first argument, with the referrer as the second argument(which may be tricky with STL types)
No it is not possible to know who points at x.
In addition C++ is not garbage collected.
Even if you use shared_pointer's to x, you can find out how many pointers there are to x, but not whom they are.
No, it is not possible to know this with a raw pointer.
Certain types of "smart pointers" (which are actually objects that contain pointers plus other meta-data about the pointer) keep as part of their meta-data a list or count of all references to the pointed-to object from other smart pointers. In garbage collected languages, this mechanism is used to determine if an object is no longer referenced, but it is not a characteristic of a standard C or C++ pointer.
In addition to the other accurate answers here, note that in garbage collected languages (C++.NET, I guess, or any of the normal other ones, Java/C#, etc), one technique for garbage collection is to traverse references, marking everything that is pointed to.
But note that this actually works the other direction. I start from a known set of objects, and follow all of their links to other objects, etc. I generally never am able to say "given this object, let me calculate who points to it or holds references to it".
The answer to your actual question is no, C++ doesn't have a mechanism for figuring out how many references are still active. Nonetheless,while C++ is not garbage collected, if you're interested, you can try one of the gc_classes. Here's a stackoverflow post listing some of them: Garbage collection Libraries in C++
Yes, you can know if--at a given point in execution--there is a pointer to your variable. All you need to do is keep track of the memory allocated to every variable in the process. This means knowing the start and end addresses of the stack and the heap. You can then do a simple sequential search for the location of your variable in those address ranges.
Though iterating over those relatively small portions of memory should not take long, you could optimize the search time at the expense of some additional memory overhead and pointer creation overhead by maintaining a structure that tracks only references. That gives you a smaller list to search.

Know what references an object

I have an object which implements reference counting mechanism. If the number of references to it becomes zero, the object is deleted.
I found that my object is never deleted, even when I am done with it. This is leading to memory overuse. All I have is the number of references to the object and I want to know the places which reference it so that I can write appropriate cleanup code.
Is there some way to accomplish this without having to grep in the source files? (That would be very cumbersome.)
A huge part of getting reference counting (refcounting) done correctly in C++ is to use Resource Allocation Is Initialization so it's much harder to accidentally leak references. However, this doesn't solve everything with refcounts.
That said, you can implement a debug feature in your refcounting which tracks what is holding references. You can then analyze this information when necessary, and remove it from release builds. (Use a configuration macro similar in purpose to how DEBUG macros are used.)
Exactly how you should implement it is going to depend on all your requirements, but there are two main ways to do this (with a brief overview of differences):
store the information on the referenced object itself
accessible from your debugger
easier to implement
output to a special trace file every time a reference is acquired or released
still available after the program exits (even abnormally)
possible to use while the program is running, without running in your debugger
can be used even in special release builds and sent back to you for analysis
The basic problem, of knowing what is referencing a given object, is hard to solve in general, and will require some work. Compare: can you tell me every person and business that knows your postal address or phone number?
One known weakness of reference counting is that it does not work when there are cyclic references, i.e. (in the simplest case) when one object has a reference to another object which in turn has a reference to the former object. This sounds like a non-issue, but in data structures such as binary trees with back-references to parent nodes, there you are.
If you don't explicitly provide for a list of "reverse" references in the referenced (un-freed) object, I don't see a way to figure out who is referencing it.
In the following suggestions, I assume that you don't want to modify your source, or if so, just a little.
You could of course walk the whole heap / freestore and search for the memory address of your un-freed object, but if its address turns up, it's not guaranteed to actually be a memory address reference; it could just as well be any random floating point number, of anything else. However, if the found value lies inside a block a memory that your application allocated for an object, chances improve a little that it's indeed a pointer to another object.
One possible improvement over this approach would be to modify the memory allocator you use -- e.g. your global operator new -- so that it keeps a list of all allocated memory blocks and their sizes. (In a complete implementation of this, operator delete would have remove the list entry for the freed block of memory.) Now, at the end of your program, you have a clue where to search for the un-freed object's memory address, since you have a list of memory blocks that your program actually used.
The above suggestions don't sound very reliable to me, to be honest; but maybe defining a custom global operator new and operator delete that does some logging / tracing goes in the right direction to solve your problem.
I am assuming you have some class with say addRef() and release() member functions, and you call these when you need to increase and decrease the reference count on each instance, and that the instances that cause problems are on the heap and referred to with raw pointers. The simplest fix may be to replace all pointers to the controlled object with boost::shared_ptr. This is surprisingly easy to do and should enable you to dispense with your own reference counting - you can just make those functions I mentioned do nothing. The main change required in your code is in the signatures of functions that pass or return your pointers. Other places to change are in initializer lists (if you initialize pointers to null) and if()-statements (if you compare pointers with null). The compiler will find all such places after you change the declarations of the pointers.
If you do not want to use the shared_ptr - maybe you want to keep the reference count intrinsic to the class - you can craft your own simple smart pointer just to deal with your class. Then use it to control the lifetime of your class objects. So for example, instead of pointer assignment being done with raw pointers and you "manually" calling addRef(), you just do an assignment of your smart pointer class which includes the addRef() automatically.
I don't think it's possible to do something without code change. With code change you can for example remember the pointers of the objects which increase reference count, and then see what pointer is left and examine it in the debugger. If possible - store more verbose information, such as object name.
I have created one for my needs. You can compare your code with this one and see what's missing. It's not perfect but it should work in most of the cases.
http://sites.google.com/site/grayasm/autopointer
when I use it I do:
util::autopointer<A> aptr=new A();
I never do it like this:
A* ptr = new A();
util::autopointer<A> aptr = ptr;
and later to start fulling around with ptr; That's not allowed.
Further I am using only aptr to refer to this object.
If I am wrong I have now the chance to get corrections. :) See ya!

When to use pointers, and when not to use them

I'm currently doing my first real project in C++ and so, fairly new to pointers. I know what they are and have read some basic usage rules. Probably not enough since I still do not really understand when to use them, and when not.
The problem is that most places just mention that most people either overuse them or underuse them. My question is, when to use them, and when not?.
Currently, in many cases i'm asking myself, should I use a pointer here or just pass the variable itself to the function.
For instance, I know that you can send a pointer to a function so the function can actually alter the variable itself instead of a copy of it. But when you just need to get some information of the object once (for instance the method needs a getValue() something), are pointers usefull in that case?
I would love to see either reactions but also links that might be helpfull. Since it is my first time using C++ I do not yet have a good C++ book (was thinking about buying one if I keep on using c++ which I probably will).
For the do's and dont's of C++:
Effective C++ and More Effective C++ by Scott Meyers.
For pointers (and references):
use pass by value if the type fits into 4 Bytes and don't want to have it changed after the return of the call.
use pass by reference to const if the type is larger and you don't want to have it changed after the return of the call.
use pass by reference if the parameter can't be NULL
use a pointer otherwise.
dont't use raw pointers if you don't need to. Most of the time, a smart pointer (see Boost) is the better option.
From the c++ faq:
Use references when you can, and
pointers when you have to.
https://isocpp.org/wiki/faq/references#refs-vs-ptrs
1) I tend to use member variables scoped with the class. They are constructed in the initializer of the class, and I don't need to worry about pointers.
2) You can pass by reference to a function, and not worry about passing pointers. This effectively will pass a pointer to the method / function that can be used as if you passed the class, but without the overhead of copying the class itself.
3) If I need to control the lifetime of an object that is independent of my main application architecture's classes... then I will use an auto_ptr from the STL to automatically handle the pointer's destruction when no one longer references it. Check it out - it's the way to go.
Use it whenever you are dealing with allocated memory or passing arguments by reference to a method; I don't think there is a rule for not using pointers.
My rules of thumb:
Always pass function parameters as const references,
unless they are built-in types, in which case they are copied (and const/non-const becomes a question of style as the caller isn't affected) or
unless they are meant to be changed inside the function so that the changes reflect at the caller's, in which case they are passed by non-const reference or
unless the function should be callable even if callers don't have an object to pass, then they are passed as pointers, so that callers can pass in NULL pointers instead (apply #1 and #3 to decide whether to pass per const T* or per T*)
Streams must always be passed around as non-const references.
Generally, when you can use references instead of pointers it is a good idea. A reference must have a target (no NULL pointer violations), they allow the same semantics as pointers when being passed as arguments to a function, and they are generally nicer to use for beginners (or those not coming from a C background).
Pointers are required when you want to do dynamic allocation of memory; when you need to deal with an unknown amount of things that will be later specified. In this case the interface to access memory is through new and delete which deal in pointers.
My philosophy is to always pass by value, unless you need to modify the variable passed or copying the object is expensive. In both these cases, consider using a reference instead of a pointer first: if you don't need to change which object you're referencing, nor do you need a possible extremal value (NULL pointer), you can use a reference.
Don't forget about iterators either.
All good answers above. Additionally, if you are performing some processor-intensive work, it's important to realize that dereferencing a pointer will likely be a cache miss on your processor. It's a good idea to keep your data accessible with minimal pointer dereferences.
Class attribute: pointer
Variables declared in methods: no pointers, so we avoid memory leaks.
In this way, prevent memory leaks and controlle attribute's consistency.
Salu2.