Erlang memory variables management

Erlang memory variables management - list

Consider this Erlang example:
X1 = [1, 2, 4, 6, ...],   % assume there are 10 million elements
X2 = [2, 6, 5, 2, ...],   % assume there are 100 million elements
X3=.......
.
.
X10000=.....
This code will allocate space for billions of elements, so let's try this:
L=[X1, X2,....., X10000].
In Java, "X1, X2, ..." are just references to memory allocations: the code allocates memory for the values of these variables and assigns each variable the address of its value. So when we create the list L and mention X1 etc., the variables refer to the earlier allocations, and the memory has been allocated only once.
If we consider that "=" is an expression and not an assignment between a variable and a memory address (that is what Joe said in his book), will the values of X1, ..., X10000 in the list L be allocated in memory a second time?

From the book you were already recommended on a previous question, section 12.4.1:
Objects on the heap are passed by references within the context of one process. If you call one function with a tuple as an argument, then only a tagged reference to that tuple is passed to the called function. When you build new terms you will also only use references to sub terms.
L=[X1,...]
is building a new term, so it only uses references to X1 etc. and allocates enough new memory to make a list; it doesn't copy the lists referred to by X1 etc.
In both cases the list members are passed by reference. Now if you have
X1 = [1,2],
L = [X1, X1]
the situation is more interesting (but still explained by that quote); there are not two copies of [1, 2] in memory like there would be in
L = [[1, 2], [1, 2]]
instead both members of L point to the same list.
Fundamentally, Erlang allocates memory for values, not for variables, unlike C and its descendants including Java; a variable doesn't really correspond to a memory address.
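As a loose analogy only (Erlang terms are immutable and garbage-collected, which plain C++ objects are not), the sharing described in this answer can be sketched in C++ with reference-counted pointers. The names Term, make_term and shares_storage are invented for this illustration and have nothing to do with how the Erlang VM is actually written.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an immutable Erlang list living on the heap.
using Term = std::shared_ptr<const std::vector<int>>;

Term make_term(std::vector<int> values) {
    return std::make_shared<std::vector<int>>(std::move(values));
}

// Builds the equivalent of L = [X1, X1] and reports whether both
// members refer to one single allocation of [1, 2].
bool shares_storage() {
    Term x1 = make_term({1, 2});
    std::vector<Term> l = {x1, x1};   // copies two references, not the data
    assert(x1.use_count() == 3);      // x1 itself plus the two list members
    return l[0].get() == l[1].get() && l[0].get() == x1.get();
}
```

Calling shares_storage() returns true: the outer list costs two references, while the inner list exists once.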

Related

What is the result of the double use of the new int?

I've searched far and wide and I cannot work out what this produces. I've seen no other examples where "new int []" is used twice within one array. Can anyone help?
int *t [2] = { new int [2], new int [2] };

t is an array of two int*, which are, generically, pointers to int.
The new operator is allocating an array of 2 consecutive int on the heap, by returning a memory address to that allocated memory (int*). This is done twice, thus allocating two arrays and storing them in the outer array.

Since new int [2] gives you a heap-allocated array of two integers (each time you call it), you'll end up with an array t of integer pointers pointing to distinct arrays of integers, something like this:
(array) (points to) (arrays-on-heap)
t: [0] -> [int0, int1]
[1] -> [int2, int3]
Were you to print out &t, &(t[0]) and &(t[1]), you would find the first two addresses the same and the third slightly higher (by the size of an int *). This is because array elements are placed consecutively.
Printing out t[0] and t[1] may give wildly disparate values, since they can come from anywhere on the heap. They probably won't be that different, simply because consecutive heap allocations tend to be done from consecutive memory (a), but they're likely to be separated by some memory. This is because many common allocation strategies involve allocating blocks of a minimum size/resolution, with inline control information between blocks.
Printing out &(t[0][0]) and &(t[0][1]) will again give you close values since they form part of an array.
Note that those paragraphs above are not all mandated to be true by the standard, they're just the most common scenarios. It's possible that allocation strategies may involve exact sizes and out-of-line control information, but it would be unlikely.
(a) There may be exceptions to this in optimised allocators if, for example, different-sized requests come from different pools, or there's a separate preferred pool per thread. But, in the general case here, that's unlikely.
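The address relationships described above can be checked with a small sketch. The function name check_layout is made up for this example, and it only tests the properties that the language guarantees (adjacency inside each array, distinct heap blocks), not how far apart the two heap blocks end up.

```cpp
#include <cassert>

// Returns true when the layout facts described above hold.
bool check_layout() {
    int *t[2] = { new int[2], new int[2] };
    assert(sizeof(t) == 2 * sizeof(int *));   // t itself is just two pointers

    // The array and its first element start at the same address.
    bool same_start = (static_cast<void *>(&t) == static_cast<void *>(&t[0]));
    // Outer elements are adjacent: &t[1] is exactly one int* past &t[0].
    bool outer_adjacent = (&t[0] + 1 == &t[1]);
    // Each new[] call returned a different block.
    bool distinct_blocks = (t[0] != t[1]);
    // Elements inside one heap block are adjacent too.
    bool inner_adjacent = (&t[0][0] + 1 == &t[0][1]);

    delete[] t[0];   // every new[] needs a matching delete[]
    delete[] t[1];
    return same_start && outer_adjacent && distinct_blocks && inner_adjacent;
}
```

Note the two delete[] calls: the declaration in the question allocates twice, so it must also free twice.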

What happens to previous data on new assignment?

Consider the following code -
{
int n = 3;
n = 5;
std::vector<int> v = { 1, 2, 3 .... 3242};
v = std::vector<int>{10000, 10001, 10002... 50000};
}
For primitive data types like int, the program can simply overwrite the previous memory location with the new value and get away with it. However, what happens with types like std::vector, where the length of the previous value might not be the same as that of the new one?
In other words, what happens to the part { 1, 2, 3 .... 3000} when v is reassigned to the new {10000, 10001, 10002... 50000} value? Does it simply throw away the previous value and reassign the internal pointers to new locations? Or does it overwrite the previous locations with new data as much as it can, and either reallocate new memory for a larger assignment or clear out existing memory for a shorter one, thus preserving the capacity() of the initial vector?
Would this ever be preferable to clearing out the contents (v = vector<int>{} vs. v.clear())? I ask because I saw this type of code somewhere.

However what happens to types like std::vector, where sizes of previous value might not be same as the new assignment.
I take it you mean that the length of the new data array might be different?
std::vector separates the concerns of its internal storage and how much of that storage is in use. If the new data has fewer, the same, or a few more elements (that is, it still fits within the current capacity), a copy assignment typically re-uses the same storage. It's more complex than simply being overwritten because old objects will need their destructors called (if they are not PODs), but essentially, yes. They are overwritten (safely).
If you look at the source code of std::vector you'll see a lot of quite complex code covering all the cases you mention, plus some more you have not.
Writing an exception-safe, optimally-efficient vector is not trivial.
Unless you are interested in the implementation (because you want to improve it, maintain it, or are just curious) the documentation of std::vector's behaviour is sufficient to reason about expectations you may have of it.
Pay particular attention to which operations cause the iterators to be invalidated. This is a useful hint that internal objects are either being moved within storage, or new storage may be allocated.
link: http://en.cppreference.com/w/cpp/container/vector
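To make the two cases concrete, here is a minimal sketch; the function names are invented for the example. The standard does not specify what capacity() is after an assignment, so the comments only describe what common implementations tend to do. Note also that assigning from a temporary, as in the question, is a move assignment rather than a copy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copy-assigns a shorter vector and returns the capacity afterwards.
std::size_t capacity_after_copy_assign() {
    std::vector<int> v(1000, 1);             // 1000 elements
    const std::vector<int> shorter(10, 2);
    v = shorter;                             // copy assignment: old elements are replaced
    assert(v.size() == 10 && v.front() == 2);
    return v.capacity();                     // commonly still >= 1000, but unspecified
}

// Move-assigns from a temporary, as in the question.
std::size_t size_after_move_assign() {
    std::vector<int> v(1000, 1);
    // With the default allocator, v typically takes over the temporary's
    // buffer and the old buffer is released; nothing is copied element-wise.
    v = std::vector<int>(5000, 7);
    return v.size();
}
```

On the clear() question: v.clear() destroys the elements but keeps the capacity, whereas v = std::vector<int>{} commonly ends up releasing the old buffer. Which one is preferable depends on whether you intend to refill the vector soon.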

does tcl lrange create under the hood a new copy or not of the original list

let us say that I have a massive list variable. If I reference a range from the list by using the lrange commands, is tcl creating a copy of the range, while retaining the original list (assuming that return value is not saved into a variable), or using some immutable reference "shtick" to save memory?
For example: let us say that I have a list variable biggie, and that ~99% of the memory footprint of my script goes to storing the biggie list. Will this line cause my script to almost double its memory footprint?
foreach item [lrange $biggie 1 end-1] { puts $item }
Thanks

The lrange command copies the list elements. The amount of memory consumed might not double though; the elements that are present in both lists will be handled by reference. The memory that will be duplicated will be the memory to store the array of pointers to the elements; which will be 4 bytes per element on 32-bit systems and 8 bytes per element on 64-bit systems (plus a minuscule amount of fixed overhead).
Any string representation of the list(s) will not be shared at all.
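The idea of "copy the array of pointers, share the elements" can be modelled in C++ as follows. This is only an analogy under assumed names (Element, List, range_of); it is not Tcl's implementation, just a sketch of why the extra memory is one pointer per element rather than a second copy of the data.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical model: a list is an array of pointers to shared, immutable elements.
using Element = std::shared_ptr<const std::string>;
using List = std::vector<Element>;

// Copies the pointers for indices first..last (inclusive); the strings are not copied.
List range_of(const List &list, std::size_t first, std::size_t last) {
    if (first >= list.size() || first > last) return List();
    if (last >= list.size()) last = list.size() - 1;
    return List(list.begin() + static_cast<std::ptrdiff_t>(first),
                list.begin() + static_cast<std::ptrdiff_t>(last) + 1);
}

bool range_shares_elements() {
    List biggie;
    for (int i = 0; i < 5; ++i) {
        biggie.push_back(std::make_shared<std::string>(std::to_string(i)));
    }
    List middle = range_of(biggie, 1, biggie.size() - 2);   // like lrange $biggie 1 end-1
    assert(middle.size() == 3);
    // Same element objects, only the pointer array is new.
    return middle.front().get() == biggie[1].get() && *middle.back() == "3";
}
```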

Why am I being told that an array is a pointer? What is the relationship between arrays and pointers in C++?

My background is C++ and I'm currently about to start developing in C# so am doing some research. However, in the process I came across something that raised a question about C++.
This C# for C++ developers guide says that
In C++ an array is merely a pointer.
But this StackOverflow question has a highly-upvoted comment that says
Arrays are not pointers. Stop telling people that.
The cplusplus.com page on pointers says that arrays and pointers are related (and mentions implicit conversion, so they're obviously not the same).
The concept of arrays is related to that of pointers. In fact, arrays work very much like pointers to their first elements, and, actually, an array can always be implicitly converted to the pointer of the proper type.
I'm getting the impression that the Microsoft page wanted to simplify things in order to summarise the differences between C++ and C#, and in the process wrote something that was simpler but not 100% accurate.
But what have arrays got to do with pointers in the first place? Why is the relationship close enough for them to be summarised as the "same" even if they're not?
The cplusplus.com page says that arrays "work like" pointers to their first element. What does that mean, if they're not actually pointers to their first element?

There is a lot of bad writing out there. For example the statement:
In C++ an array is merely a pointer.
is simply false. How can such bad writing come about? We can only speculate, but one possible theory is that the author learned C++ by trial and error using a compiler, and formed a faulty mental model of C++ based on the results of his experiments. This is possibly because the syntax used by C++ for arrays is unconventional.
The next question is, how can a learner know if he/she is reading good material or bad material? Other than by reading my posts of course ;-) , participating in communities like Stack Overflow helps to bring you into contact with a lot of different presentations and descriptions, and then after a while you have enough information and experience to make your own decisions about which writing is good and which is bad.
Moving back to the array/pointer topic: my advice would be to first build up a correct mental model of how object storage works when we are working in C++. It's probably too much to write about just for this post, but here is how I would build up to it from scratch:
- C and C++ are designed in terms of an abstract memory model; however, in most cases this translates directly to the memory model provided by your system's OS or an even lower layer.
- The memory is divided up into basic units called bytes (usually 8 bits).
- Memory can be allocated as storage for an object; e.g. when you write int x; it is decided that a particular block of adjacent bytes is set aside to store an integer value. An object is any region of allocated storage. (Yes, this is a slightly circular definition!)
- Each byte of allocated storage has an address, which is a token (usually representable as a simple number) that can be used to find that byte in memory. The addresses of any bytes within an object must be sequential.
- The name x only exists during the compilation stage of a program. At runtime there can be int objects allocated that never had a name; and there can be other int objects with one or more names during compilation.
- All of this applies to objects of any other type, not just int.
- An array is an object which consists of many adjacent sub-objects of the same type.
- A pointer is an object which serves as a token identifying where another object can be found.
From here on in, C++ syntax comes into it. C++'s type system uses strong typing, which means that each object has a type. The type system extends to pointers. In almost all situations, the storage used to store a pointer only saves the address of the first byte of the object being pointed to; and the type system is used at compilation time to keep track of what is being pointed to. This is why we have different types of pointer (e.g. int *, float *) despite the fact that the storage may consist of the same sort of address in both cases.
Finally: the so-called "array-pointer equivalence" is not an equivalence of storage, if you understood my last two bullet points. It's an equivalence of syntax for looking up members of an array.
Since we know that a pointer can be used to find another object; and an array is a series of many adjacent objects; then we can work with the array by working with a pointer to that array's first element. The equivalence is that the same processing can be used for both of the following:
Find Nth element of an array
Find Nth object in memory after the one we're looking at
and furthermore, those concepts can be both expressed using the same syntax.
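A small sketch of that point, with invented function names: the array and the pointer are different kinds of object with different storage, yet the lookup syntax is shared.

```cpp
#include <cassert>
#include <cstddef>

// Sums n ints starting at p; works for any pointer into an array.
int sum(const int *p, std::size_t n) {
    assert(p != nullptr);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += p[i];          // same as *(p + i)
    }
    return total;
}

bool storage_differs_but_lookup_matches() {
    int a[4] = {10, 20, 30, 40};
    int *p = a;                 // a converts to a pointer to its first element

    // "Nth element of the array" and "Nth object after the pointed-to one" agree.
    bool same_lookup = (a[2] == p[2]) && (&a[2] == p + 2);
    // The array object holds four ints; the pointer object holds one address.
    bool different_storage =
        (sizeof(a) == 4 * sizeof(int)) && (sizeof(p) == sizeof(int *));
    return same_lookup && different_storage && sum(a, 4) == 100;
}
```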

They are most definitely not the same thing at all, but in this case, confusion can be forgiven because the language semantics are ... flexible and intended for the maximum confusion.
Let's start by simply defining a pointer and an array.
A pointer (to a type T) points to a memory space which holds at least one T (assuming non-null).
An array is a memory space that holds multiple Ts.
A pointer points to memory, and an array is memory, so you can point inside or to an array. Since you can do this, pointers offer many array-like operations. Essentially, you can index any pointer on the presumption that it actually points to memory for more than one T.
Therefore, there's some semantic overlap between (pointer to) "Memory space for some Ts" and "Points to a memory space for some Ts". This is true in any language- including C#. The main difference is that they don't allow you to simply assume that your T reference actually refers to a space where more than one T lives, whereas C++ will allow you to do that.
Since all pointers to a T can be pointers to an array of T of arbitrary size, you can treat pointers to an array and pointers to a T interchangeably. The special case of a pointer to the first element is that the "some Ts" for the pointer and "some Ts" for the array are equal. That is, a pointer to the first element yields a pointer to N Ts (for an array of size N) and a pointer to the array yields ... a pointer to N Ts, where N is equal.
Normally, this is just interesting memory crapping-around that nobody sane would try to do. But the language actively encourages it by converting the array to the pointer to the first element at every opportunity, and in some cases where you ask for an array, it actually gives you a pointer instead. This is most confusing when you want to actually use the array like a value, for example, to assign to it or pass it around by value, when the language insists that you treat it as a pointer value.
Ultimately, all you really need to know about C++ (and C) native arrays is: don't use them. Pointers to arrays have some symmetries with pointers to values at the most fundamental "memory as an array of bytes" kind of level, and the language exposes this in the most confusing, unintuitive and inconsistent way imaginable. So unless you're hot on learning implementation details nobody should have to know, use std::array, which behaves in a totally consistent, very sane way, just like every other type in C++. C# gets this right by simply not exposing this symmetry to you (because nobody needs to use it, give or take).
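For what it's worth, here is a minimal sketch of what "behaves like every other type" means for std::array; the function names are invented for the example.

```cpp
#include <array>
#include <cassert>

// Takes the array by value: the caller's copy is untouched.
std::array<int, 3> doubled(std::array<int, 3> values) {
    for (int &v : values) v *= 2;
    return values;
}

bool array_acts_like_a_value() {
    std::array<int, 3> a = {1, 2, 3};
    std::array<int, 3> b = a;            // whole-array copy, no decay to a pointer
    std::array<int, 3> c = doubled(a);   // passed and returned by value
    b[0] = 99;                           // changes b only
    assert(a.size() == 3);               // the size travels with the type
    return a[0] == 1 && b[0] == 99 && c == std::array<int, 3>{2, 4, 6};
}
```

None of these operations (copy, assignment, pass by value, comparison) works on a native int[3].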

Arrays and pointers in C and C++ can be used with the exact same semantics and syntax in the vast majority of cases.
That is achieved by one feature:
Arrays decay to pointers to their first element in nearly all contexts.
Exceptions in C: sizeof, _Alignof, _Alignas, address-of &
In C++, the difference can also be important for overload-resolution.
In addition, array notation for function arguments is deceptive; these function declarations are equivalent:
int f(int* a);
int f(int a[]);
int f(int a[3]);
But not to this one:
int f(int (&a)[3]);
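A sketch of these rules in C++, using hypothetical names (takes_pointer, length, decay_exceptions_hold). The static_assert checks at compile time that the array-style parameter really is a pointer; the rest checks the sizeof and address-of exceptions.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Array notation in a parameter list is adjusted to a pointer...
void takes_pointer(int a[3]);
static_assert(std::is_same<decltype(takes_pointer), void(int *)>::value,
              "int a[3] as a parameter means int *a");

// ...but a reference to an array keeps the element count in the type.
template <std::size_t N>
std::size_t length(const int (&)[N]) { return N; }

bool decay_exceptions_hold() {
    int a[3] = {1, 2, 3};
    int *p = a;                                   // decay: a becomes &a[0]
    assert(p == &a[0]);
    // sizeof sees the whole array, not a pointer.
    bool sizeof_sees_array = (sizeof(a) == 3 * sizeof(int));
    // & yields a pointer to the array, not a pointer to a pointer.
    bool address_of_is_array_pointer =
        std::is_same<decltype(&a), int (*)[3]>::value;
    return sizeof_sees_array && address_of_is_array_pointer && length(a) == 3;
}
```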

Besides what has already been said, there is one big difference:
pointers are variables that store memory addresses; they can be incremented or decremented, and the values they store can change (they can point to any other memory location). That's not the same for arrays; once they are allocated, you can't change the memory region they refer to, e.g. you cannot assign other values to them:
int my_array[10];
int x = 2;
my_array = &x;  // error: an array cannot be assigned to
my_array++;     // error: an array cannot be incremented
Whereas you can do the same with a pointer:
int *p = my_array;
p++;
p = &x;

The meaning in this guide was simply that in C# an array is an object (perhaps like the STL containers that we can use in C++), while in C++ an array is basically a sequence of variables located and allocated one after the other, and that's why we can refer to them using a pointer (pointer++ will give us the next one etc.).

It's as simple as:
int arr[10];
int* arr_pointer1 = arr;
int* arr_pointer2 = &arr[0];
So, since arrays are contiguous in memory, writing
arr[1];
is the same as writing:
*(arr_pointer1 + 1);
Pushing things a bit further, writing:
arr[2];
// resolves to
*(arr + 2);
// note also that this is perfectly valid
2[arr];
// resolves to
*(2 + arr);

Assigning specific/explicit memory locations to my objects

Can I make an object in C++ by specifying its memory address explicitly? I ask because I have a separate id for each of my entities (the objects). So if I can do this, I will be able to traverse all my objects by mere pointer addition.
Consider:
I have an object at memory location x.
I want to create the next object at memory location x + (the unique id of the next object) * K,
where K is the constant gap between two objects (say).

You can specify the memory using the placement new operator.
So if I can do this, I will be able to traverse all my objects by mere pointer addition.
Not really. Disregard answers that tell you to do this! You can't do pointer arithmetic outside of an array. Just because you have 2 objects o1 and o2, one located at 0x4 and the other at 0x5, it doesn't mean that &o1 + 1 will yield &o2. In fact, using the result to reach o2 is undefined behavior.
For this to work as expected, you can allocate the memory dynamically, or, better yet, use a std::vector and use iterators. (that's what they are for)
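A minimal sketch of the std::vector approach, assuming the ids are dense and start at 0 so that an id can double as an index. Entity and the function names are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical entity type; the id doubles as the index into the vector.
struct Entity {
    std::size_t id;
    int value;
};

std::vector<Entity> make_entities(std::size_t count) {
    std::vector<Entity> entities;
    entities.reserve(count);                     // one allocation up front
    for (std::size_t id = 0; id < count; ++id) {
        entities.push_back(Entity{id, static_cast<int>(id) * 10});
    }
    assert(entities.size() == count);
    return entities;
}

int sum_values(const std::vector<Entity> &entities) {
    int total = 0;
    for (const Entity &e : entities) total += e.value;   // iterator-based traversal
    return total;
}

// Inside one vector the storage is a real array, so pointer steps are well defined.
bool third_is_two_steps_from_first(const std::vector<Entity> &entities) {
    assert(entities.size() >= 3);
    return entities.data() + 2 == &entities[2];
}
```

Here entities[id] gives direct access by id, and the pointer arithmetic is legal because all elements live in the same array.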

C++ placement new operator does what you want. Just make sure you allocate them enough memory.
See: http://www.parashift.com/c++-faq-lite/dtors.html#faq-11.10
The example from above link:
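alignas(Fred) char memory[sizeof(Fred)];  // alignas added here so the buffer is suitably aligned for Fred
void* place = memory;
Fred* f = new(place) Fred();              // requires #include <new>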
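A slightly fuller sketch of the same idea, showing the part that is easy to forget: an object created with placement new must be destroyed by calling its destructor explicitly, because no delete will ever run for it. The Fred below is a hypothetical stand-in, not the class from the FAQ, and the live counter exists only to make the lifetime visible.

```cpp
#include <cassert>
#include <new>

// Hypothetical class standing in for Fred.
struct Fred {
    static int live;            // counts constructed-but-not-yet-destroyed objects
    int value = 42;
    Fred() { ++live; }
    ~Fred() { --live; }
};
int Fred::live = 0;

int placement_new_roundtrip() {
    alignas(Fred) unsigned char memory[sizeof(Fred)];   // raw, suitably aligned storage
    Fred *f = new (memory) Fred();   // constructs in the buffer, allocates nothing
    assert(Fred::live == 1);
    int seen = f->value;
    f->~Fred();                      // explicit destructor call; do not use delete here
    return seen;
}
```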

You can use Andrew's answer and allocate an array of objects.
In C++, you can also use a special form of the new operator to create an object at a preallocated address:
alignas(MyClass) char memory[sizeof(MyClass[100])]; // suitably aligned memory for an array of 100 MyClass objects
// Cast to MyClass-Array to let the compiler calculate
// the start addresses of the array elements.
MyClass * array = reinterpret_cast<MyClass *>(memory);
// create object with index 5
MyClass * pointerToSixthEntry = new (&array[5]) MyClass();
// Use the object by its index (no pointer arithmetic needed)
array[5].sayHelloWorld();
This is called placement new.
It is ugly and I wouldn't do it. If it's not really memory/runtime critical, you could think about using an STL map with the index as key. As a positive side effect, you get the tracking of used/unused IDs for free.
Using a preallocated vector might also help. Check Luchian's answers and comments.

A question about your ids: are they consecutive, with no missing values?
If so,
I think std::map suits your needs:
the keys (your ids) are kept sorted (just insert within recursion),
and you can use iterators (in place of pointer arithmetic).
http://www.cplusplus.com/reference/stl/map/
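A short sketch of the std::map idea, with made-up ids and names. It also covers the case where ids have gaps, since the map only stores the ids that were actually inserted.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical sample data: three entities with non-consecutive ids.
std::map<int, std::string> sample_entities() {
    std::map<int, std::string> entities;
    entities[42] = "c";
    entities[7] = "a";
    entities[19] = "b";
    assert(entities.size() == 3);
    return entities;
}

// Iterating a std::map visits the keys in ascending order.
std::vector<int> ids_in_order(const std::map<int, std::string> &entities) {
    std::vector<int> ids;
    for (const auto &entry : entities) ids.push_back(entry.first);
    return ids;
}
```

Traversal yields 7, 19, 42 regardless of insertion order, and entities.count(id) tells you whether an id is in use. Map iterators are bidirectional, so you step with ++ and -- rather than adding arbitrary offsets.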
