How to print memory addresses in OCaml? - ocaml

Say I have a variable:
let a = ref 3 in magic_code
Magic_code should print the address in memory that is stored in a. Is there something like that? I googled this but nothing came up...

This should work:
let a = ref 3 in
let address = 2*(Obj.magic a) in
Printf.printf "%d" address;;
OCaml distinguishes between heap pointers and integers using the least significant bit of a word, 0 for pointers and 1 for integers (see this chapter in Real World OCaml).
Obj.magic is a function of type 'a -> 'b that lets you bypass typing (i.e. arbitrarily "cast"). If you force OCaml to interpret the reference as an int by unsafely casting it via Obj.magic, the value you get is the address shifted right by one bit. To obtain the actual memory address, you need to shift it back left by 1 bit, i.e. double the value.
Also see this answer.

Related

Passing &variable to a function

I have a question about lines 2 and 9 of this code.
I ran this code in Code::Blocks, and it turns out that trip(&y) runs and makes y = 21 by the end of line 9. This is not what I expected.
I thought &y would return the address (something like a hexadecimal number) and pass it to trip. Then trip would triple that number and perhaps change the address of int y, or produce an error.
Which part of my logic is at fault?
My reasoning is:
&y returns the address of the variable (hexadecimal).
trip (int* x) takes in the number (address in this case) and triples it.
Therefore, nothing has been done to the actual value of y, which is 7.
Edit: The answer is 105.
You are right about the address of y that is given to the function trip(int*).
Your mistake is that the function trip(int*) dereferences the address to access the value it points to.
Assume y has the value 7 and is stored at address 0x0014 (simplified for your convenience).
The function trip(int*) gets the address 0x0014 and uses it to get to the value 7 of y.
It then triples this value and stores the result at the location the address points to, overwriting the value 7 of y with the new value 21.
This dereferencing of the address is done by the star (*) right in front of the x in the body of the function trip(int*).
To clarify: addresses aren't necessarily hexadecimal values. Hexadecimal is just one way to display a number. An address is simply a number identifying a certain place in memory. In this case 0x0014 is the same as 20 in decimal, meaning y is stored at address 20.
Please keep in mind that pointers are rarely the right way to do something, as the function quin(int&) shows in this case. The problem that trip(int*) solves with a pointer is solved much more cleanly in quin(int&). References are cleaner and safer.
The function trip(int*) does not check for the case that the address given is invalid (the address given is NULL/nullptr). quin(int&) doesn't even have to check for this since references cannot be NULL in C++.
If you have to use pointers, consider using smart pointers like std::unique_ptr<T> and std::shared_ptr<T>.
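The original listing is not reproduced above, so the following is only a minimal sketch of what the two functions plausibly look like. trip and quin are the names used in the question; their bodies, the null check, and the helper run_example are assumptions, chosen so that y goes from 7 to 21 and then to 105, matching the values mentioned.

```cpp
#include <cassert>

// Assumed body: triples the int that x points to.
// The unary * dereferences the pointer; x itself (the address) is not changed.
void trip(int* x) {
    assert(x != nullptr);  // a pointer may be null, so check before dereferencing
    *x = *x * 3;
}

// Assumed body: multiplies the referenced int by five.
// A reference always refers to an object, so no null check is needed.
void quin(int& x) {
    x = x * 5;
}

// Hypothetical driver reproducing the values from the question.
int run_example() {
    int y = 7;
    trip(&y);  // passes the address of y; y becomes 21
    quin(y);   // passes y by reference; y becomes 105
    return y;
}
```

Calling run_example() from main and printing the result should show the final value of y. Neither call modifies the address of y, only the int stored there.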

Converting arrays of one type to another

Basically I have an array of doubles. I want to pass this array to a function (ProcessData) which will treat them as short ints. Is it OK to create a short pointer, point it at the array, and pass this pointer to the function (code 1)?
Is this in effect the same as creating a short array, iterating through each element and converting each element of the double array to a short and then passing the short array pointer (code 2) ? Thanks
//code 1
//.....
short* shortPtr = (short*)doubleArr;
ProcessData(shortPtr);
..
//code 2
//...
short shortArr [ARRSIZE];
int i;
for (i = 0; i < ARRSIZE; i++)
{
shortArr[i] = (short)doubleArr[i];
}
ProcessData(shortArr);
You can't just cast, as the various comments have said. But if you use iterators you can get more or less the same effect:
#include <iterator> // std::begin, std::end (used at the call site)
void do_something_with_short(short i) {
/* whatever */
}
template <class Iter>
void do_something(Iter first, Iter last) {
while (first != last)
do_something_with_short(*first++);
}
You can call that template function with iterators into an array of any arithmetic type (in fact, any type that's implicitly convertible to short or, if you add a cast at the point of the call to do_something_with_short, with a type that requires a cast):
double data[10] = {}; // zero-initialized here; fill in real values as needed
do_something(std::begin(data), std::end(data));
No you can't do that. Here's at least one reason why:
An array is a contiguous sequence of equally sized elements, accessed by index, like so
[----][----][----]
Note the four dashes inside the square brackets. They indicate that in most situations in C/C++, an int is four bytes long. Array cells can be accessed by their index because if we know the memory address of the first cell (m) and how big each cell is (c), in this case four bytes, we can find the memory location of any index by computing m + index * c
[----][----][----]
 ^ array[0]
[----][----][----]
 ----  ----  ^ array[2]
Fundamentally, this is why pointers can be treated like arrays in C/C++, because when you are accessing arrays, you are basically doing pointer arithmetic anyway.
In most cases in C/C++, a short is 2 bytes long, so to represent it in the same way
[--][--][--]
If you create a short pointer and try to use it as an array, it is expected to point to something arranged like the above. If you try to index it, you are going to have problems: if you were dealing with an array of shorts, the location of array[index] would be m + 2 * index, so array[2] sits four bytes in, as shown below
[--][--][--]
 --  --  ^ array[2] (note moving along four bytes)
But since we are in reality dealing with an array of integers, the following will happen
[----][----][----]
 ----  ^ array[2] (again moving along four bytes)
which is clearly wrong: the pointer lands on the second integer, not the third.
No, because ++ptr actually does something like ptr = (char*)ptr + sizeof *ptr (with sizeof (char) being 1 by definition). So incrementing a pointer to double moves it by (usually) 8 bytes, while incrementing a pointer to short moves it by (usually) only 2 bytes.
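To make the difference concrete, here is a small sketch that contrasts the element stride of the two pointer types with a correct element-by-element conversion. The helper names stride_in_bytes and to_shorts are made up for this illustration and do not come from the question.

```cpp
#include <cassert>  // only needed for checks like the one in the note below
#include <cstddef>
#include <vector>

// Distance in bytes between two neighbouring elements of a T array:
// this is how far ++ptr moves a T*.
template <class T>
std::size_t stride_in_bytes(const T* arr) {
    const char* first = reinterpret_cast<const char*>(arr);
    const char* second = reinterpret_cast<const char*>(arr + 1);
    return static_cast<std::size_t>(second - first);
}

// Converts each double to a short, like "code 2" in the question.
// The fractional part is discarded; values outside the range of short
// are not handled here and would be undefined behaviour.
std::vector<short> to_shorts(const std::vector<double>& in) {
    std::vector<short> out;
    out.reserve(in.size());
    for (double d : in) {
        out.push_back(static_cast<short>(d));
    }
    return out;
}
```

For a double d[2], assert(stride_in_bytes(d) == sizeof(double)) holds, and likewise with short. The strides differ, which is why reinterpreting the double array through a short* reads pieces of doubles rather than converted values.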
Suppose that your kids study piano and occasionally ask you to scan a stack of sheet music given to them by their teacher, who was born in the 20th century (just like yourself). You take those sheets to your office and feed them to the photocopier. It creates decent digital scans that your kids can use on their piano, which is equipped with a touch screen. All goes well until one day your daughter brings you a rare old set of vinyl records. She despairs of finding those melodies in sheet music form, but asks you to at least copy the records. Inexperienced in musical matters, you take the disks to your office, load them into the scanner's automatic document feeder, and realize that you are deep in ... um ... crap only when you hear the vinyl disks breaking inside the stupid machine. Even if the photocopier had no ADF and you had to place each original on its glass flatbed by hand, you would hardly receive much praise when you sent the scans to your daughter.
The scanner doesn't care what you put into it, as long as it fits inside. It does its best, but the result does not meet expectations. However, had you first taken the vinyl records to an experienced musician who would transcribe them into a musical score, scanning those sheets would have truly delighted your child.
In C++, different types can differ as much as a printed sheet of paper differs from a CD. A C++ function expecting to receive an array of shorts will process any sequence of bytes/bits as an array of shorts. It doesn't care that the memory area is actually filled with values of a different type, with a completely different representation, just as the scanner didn't care about the contents of the stack in the ADF. Assuming that a function will internally convert each element of the array from double to short is the same as believing that a photocopier includes a gramophone and a musician who will automatically transcribe vinyl recordings into sheet form. Note that the latter is a possible design for a real-world photocopier, and some other programming languages work like that. But not existing implementations of C++ [1].
[1] In theory, a standard-compliant implementation of C/C++ is possible that would interpret all provisions of UB in the language in favor of the opposite answer to your question, rather than in favor of best performance. But that would make little sense for a language like C/C++.

How come console prints only 1 unique address of integer variable when using address-of operator?

I am a university student studying computer science and programming. While reading chapter 2 of C++ Primer by Stanley B. Lippman, a question came to mind: if computer memory is divided into tiny storage locations called bytes (8 bits each), each byte is assigned a unique address, and an integer variable uses 4 bytes of memory, shouldn't my console print 4 unique addresses instead of 1 when I use the address-of operator?
I doubt that the textbook is incorrect, so there must be a flaw in my understanding of computer memory. I would appreciate a clear answer to this question. Thanks in advance, people :)
shouldn't my console, when using the address-of operator print out 4 unique addresses instead of 1?
No.
The address of an object is the address of its starting byte. A 4-byte int has a unique address, the address of its first byte, but it occupies the next three bytes as well. Those next three bytes have different addresses, but they are not the address of the int.
Each variable is located in memory somewhere, so each variable gets an address you can get with the address-of operator.
That each byte in a multi-byte variable also has its own address doesn't matter; the address-of operator gives you a pointer to the variable as a whole.
Some "graphics" to hopefully explain it...
Let's say we have an int variable named i, and that the type int takes four bytes (32 bits, which is usual for int). Then you have something like
+---+---+---+---+
|   |   |   |   |
+---+---+---+---+
Some space is reserved for the four bytes; where exactly doesn't matter, since the compiler handles that for you.
Now if you use the address-of operator to get a pointer to the variable i i.e. you do &i, then you have something like
+---+---+---+---+
|   |   |   |   |
+---+---+---+---+
  ^
  |
  &i
The expression &i points to the memory position where the byte sequence of the variable begins. It can't possibly give you multiple pointers, one for each byte, and there is no need for that either.
Yes, an int typically occupies four bytes. All four bytes are allocated as one block of memory for your integer, and the block as a whole has a single address: the address of its first byte.
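The point made in these answers can be checked with a short sketch. The function name byte_addresses is invented for this illustration; it views an int through an unsigned char pointer, which C++ permits for inspecting an object's bytes.

```cpp
#include <cassert>  // only needed for checks like the one in the note below
#include <cstddef>
#include <vector>

// Collects the address of every byte occupied by `value`.
std::vector<const unsigned char*> byte_addresses(const int& value) {
    // The address of the int is the address of its first byte.
    const unsigned char* first = reinterpret_cast<const unsigned char*>(&value);
    std::vector<const unsigned char*> out;
    for (std::size_t k = 0; k < sizeof(int); ++k) {
        out.push_back(first + k);  // each following byte has the next address
    }
    return out;
}
```

For an int i, the first entry compares equal to &i (once both are converted to const void*), and the remaining entries are the consecutive addresses of the other bytes, none of which is "the address of i".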

What's the size of the function stack in OCaml?

If we write non-tail-recursive functions, OCaml pushes information onto a stack for each call, and it is possible to get a stack overflow error if we recurse too many times.
So what's the threshold? What's the size of the function stack?
For the bytecode interpreter, the documentation says the default size is 256k words. (I think the word size is 32 or 64 bits depending on the system.) You can adjust it with the l parameter in OCAMLRUNPARAM or through the Gc module.
For native code, the documentation says that the native conventions of the OS are used. So it will be different for each implementation.
I just looked these things up now; I've never needed to know in practice. Generally I don't want to write code that gets anywhere near the stacksize limit.
I don't know this for sure, but it is clear that the recursion depth is dependent on the function you are talking about. Simply consider these two (non-tail-recursive) functions:
let rec f x = print_int x; print_char '\n'; 1 + f (x+1);;
let rec g x y z = print_int x; print_char '\n'; 1 + g (x+1) y z;;
Then try f 0 and g 0 0 0, respectively. Both functions will eventually produce a stack overflow, but the latter (g) will do so "earlier".
It may be the case that there is a certain number of bytes available on the stack. You can probably approximate this number by looking at how far f goes and looking up what exactly is pushed onto the stack when a function call occurs.

Checking record size in ocaml?

Is there any way to check the size of a record in OCaml? Something like sizeof in C/C++?
Yes:
# Obj.size (Obj.repr (1,2,3,4,5)) ;;
- : int = 5
But for a record type, the size only depends on the type declaration, so you could just infer it from that.
The actual size occupied in memory is the number returned by Obj.size plus one, in words. A word is 32 or 64 bits, depending on whether you are using a 32-bit or 64-bit build of OCaml. The additional word is the block header, used for bookkeeping.
Besides the Obj module, there is also an Objsize library from Dmitry Grebeniuk ( http://forge.ocamlcore.org/projects/objsize/ ). It allows you to get more detailed information about values and their sizes.