Where are programming language functions in memory located? [closed]


Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
When a function is called, let's say in C++, is it located in a specific place in memory, since function pointers exist? If so, where exactly? How about classes? Is there memory set aside for class definitions?

Yes, functions are located in a specific place in memory. In terms of virtual memory (as opposed to physical memory or caches), a typical process layout puts them in a section called Text, which sits below the Data section (global variables), which in turn sits below the Heap. All of this is loaded when the executable is read. It is stored as machine code, which corresponds almost one-to-one with assembly, so you never see it in your C or C++ source. If you know the platform well, you can sometimes read from the code section from within your program, although it may cause a segfault, and you generally cannot write to the code section.
Unlike pointers to local variables, function pointers do not point into the stack; they hold the address of the function's first instruction in the Text section (see this helpful site). There is also a register, the program counter or instruction pointer, devoted to pointing to exactly which instruction the program is currently executing.
Member functions are ordinary code, so they live in Text as well. Class definitions themselves take up no memory at run time; only compiler-generated tables such as vtables end up in the executable, typically in a read-only data section.

All of this may be wrong, it is not my specialty. But as far as I know...
A function at runtime is a position in the executable that a "lower" part can call, altering the stack... never mind, I'll not try to explain this any further.
A class is not stored in memory. It is completely conceptual. Say you have the following structure.
struct idk
{
    char* name;
    int index;
    void* data;
};
Well, new idk at runtime doesn't actually look at some kind of definition to know what to allocate. Instead the compiler figures everything out, so the end result is that new idk turns out to be the conceptual equivalent of new char[sizeof(idk)], though that is not taking into account alignment and packing. Likewise, member accesses don't have any form of table to look at for what variables are where; the offsets are determined at compile time, so that int n = idk_thing.index would perhaps act like int n = *(int*)((char*)&idk_thing + sizeof(char*)); and so on.
And of course, a class's storage is implemented almost identically to that of a structure, and any class-specific functions are just plain old functions that, again, the compiler sets up a certain way so that it modifies variables in the class by accessing the storage of an instance of the class. I assume that this is done by passing a pointer to the storage to the function, which accesses that block of memory with offsets depending on what variable it is working with (just the same as what I was saying about structures).
Now, as for function pointers, assuming I am at least on the right track with the compiled equivalent of functions, I'd say that function pointers are just numbers to represent the location in the loaded executable that is the starting place for a function, just as a char* is a number meant to represent the location of a char in memory.

As for classes, objects in C++ can be created on the heap with new, on the stack by simply declaring them without new, or in static storage as globals. Creating them on the stack is the usual default, because the object is destroyed automatically when it goes out of scope. Very large objects are better placed on the heap, but then you'll have a memory leak if you don't explicitly delete them.
For function pointers, the pointer itself is just a variable, often on the stack, pointing to a separate block of code in read-only memory.

Related

Visible variables in c++ and how to make variable visible more [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
I am still a beginner in C++, but I know a little. I am in my first term and I want to make my own project; IMO it's the best way to learn to program. Anyway, I want to load data from a file into a dynamic array (and I know how to do that), but I want that job to be done by a dedicated function, and I want the array to be visible to other functions (alternatively, global). I know that using global variables is not a good idea, so I am wondering whether it's possible to make a variable a friend with NO classes (because I haven't learned or used classes yet).
Thanks in advance!

friend is not what you're looking for. A variable is just a named object. What you want to do here is not to somehow access the function's variable from the outside (that's not actually possible, function variables only exist while the function is executing). You want to transfer the object from one function to the other. That's done through the function's return value:
#include <vector>

std::vector<int> readDataFromFile() {
    std::vector<int> data;
    // Read the file and store it into `data`
    return data;
}

int main() {
    std::vector<int> myData = readDataFromFile();
    // Use `myData` as needed
}
You can see above that readDataFromFile works on its data variable, then returns it. This means that, right as readDataFromFile ends, myData in main (another, independent object) is initialized from data, and the data itself lives on.
Notes:
Do not use C-style arrays, new or delete. These are meant for compatibility with C and low-level memory management, not general use. A C++ dynamic array is an std::vector<YourType>.
Further notions:
Here myData is move-initialized, which means that no copy of the data is made: the dynamic array is transferred directly from data to myData
This is a case where NRVO can occur. That's an optimization which notices that data is redundant, and will replace it with direct access to myData, so there will only ever be one vector object throughout the program's execution. This is not observable in the general case, so you don't need to worry about it.

Why is constructor always called on the stack? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
For some reason, I have decided to never use dynamic memory allocation in my program. This means all the variables in my program are static and the "new" constructor is never used. But the following code crashes, and produces a stack overflow exception:
VeryLargeObject x; // Global variable -> static memory
void ResetTheObject()
{
    x = VeryLargeObject();
}
Obviously, all I want to do is give x the default value of VeryLargeObject, which is a structure containing a lot of different variables, with their own constructors of varying complexity (so there's quite some initialization work to do). But here the language/compiler has decided that this should happen on the stack before being copied, and since VeryLargeObject is too large for the stack, my program crashes.
However I have found the solution to this problem:
VeryLargeObject x;
void ResetTheObject()
{
    new (&x) VeryLargeObject();
}
I had never heard of this before, yet it does exactly what I want. This is a "placement new". It calls the constructor on the already allocated (or simply static) memory provided by a pointer.
My question, since I have the solution, is a rant: Why isn't this the default behavior of the first code? If there isn't a less hacky way of doing this (i.e. without the word "new" having anything to do with it), then why? Also, why does it send me back the pointer, even though I just provided it? I thought C++ was a great language, but this seems kind of ugly and not very well thought out.

First of all, turning on optimization might get you what you want with the first syntax. Without it, here's what you asked the compiler to do:
Create a temporary object of type VeryLargeObject.
Assign that into a global variable called x.
Since temporary objects need storage, the compiler allocates them on the stack. What the compiler is doing is, literally, what you asked the compiler to do.
The compiler may, if optimizations are turned on, recognize what the sequence does and skip the copy. This requires that the compiler can positively prove to itself that the old value of x will not get in the way in any way. Since you admit that the initialization is quite complex, you can forgive the compiler if it did not manage to do so.
You have two options. You can either create an in-place initialization function and call that instead of the constructor, or you can use placement new, like you did.
The danger with placement new, as you used it, is that it replaces the old value of x without properly destructing it. It simply assumes that x is uninitialized. If that's okay for your use, then go ahead and use it. The compiler, for its part, is not allowed to assume that.

How to know the virtual address of the data & code in C and C++? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
How can one get the virtual address of the data & code in a program?
One might say use %u or %p or something else.
printf("%u", &data);
printf("%p", &data);
I'm always confused; which one gives correct address? Both give addresses but what's the difference?
Is there any way we can say which part of memory a given virtual address belongs to? Can we identify that it's a stack address or a heap address or something else?

For (1): only printf("%p", (void*)&data) can be used to print the pointer's address; the cast to void* is required (C standard reference: C11, 7.21.6.1p8). The behaviour of printf("%u", &data) is undefined, as the format specifier is invalid for a pointer type. But note that the address you see may well not have any correspondence to a physical address; many operating systems and runtimes place one or two levels of abstraction between physical addresses and the pointer values you see.
For (2), the printf call is also valid in C++.
For (3), neither the C nor the C++ standard (aside from a couple of standard library functions in the latter) has a notion of a stack or a heap, so, no, there is no portable way of telling.

The rest of your questions suggest you are trying to use C++ to identify pedagogical concepts that have no real existence in running code.
How can one get the virtual address of the data & code in a program?
The linker assembles code and data into program segments. There can be multiple program segments containing either. However, the default is usually to have one of each. If you want to find that information, you need to create a linker MAP file as part of your program build and read that.
If you want to do that in code, you would need to write something that parses the contents of the executable file.
Your operating system may have system services that you can use to inspect pages to see (a) if they are valid and (b) what their attributes are. From that you could determine where code resides.
I'm always confused; which one gives correct address? Both give addresses but what's the difference?
%p is correct for pointers. %u is correct for unsigned integers. On systems where the two types have the same size and representation, they may appear to behave the same. On others they do not (e.g., sizeof (int) != sizeof (int*), as on most 64-bit systems, or the pointer has an unusual format, as on segmented Intel).
Use %p for pointers.
How can we get the same virtual address details in C++? (Since it is downward compatible with C, we can use same thing, but is there any other way?)
Is there any way we can say which part of memory a given virtual address belongs to? Can we identify that it's a stack address or a heap address or something else?
Memory is just memory. As a teaching tool, memory is often described in terms of data/heap/stack. That does not exist in reality. A heap and a stack are simply blocks of memory that are managed in different ways. A heap can be a stack.

You may not use %u specifier to print a pointer. That specifier is for unsigned int. %p is the correct format specifier for pointers. The difference is that using %u is technically undefined behaviour because the argument is of a different type than is required. Furthermore, you must cast &data to void*.
Indeed, printf is in the C library portion that is included in the C++ standard library, so you may use it. But if you prefer C++ streams, then you can use std::cout << &data;
That is not possible in standard C++ (nor in C).
In neither standard is there a distinction between virtual and physical memory. There is just memory. Whatever the addresses of that memory represent is specified by the OS. If the program does not run in an OS that uses virtual memory, then the addresses may be physical.

Usage of void pointers in C++ [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
Is there ever a time they're necessary or are they a symptom of poor code design?
If the former, can you give an example?
If the latter, can you explain the dangers of using them?

Void pointers can be viewed as a neutral way of passing things around. A void pointer can point at whatever you want unlike a non-void one.
Example:
Imagine you have an integer variable
int myvar;
There are two ways you can tackle this variable if you want to pass it along somewhere using a pointer:
int *myvar_ptr = &myvar;
or
void *myvar_ptr = &myvar;
In the first case using an integer pointer restricts you to pointing at only a memory block containing an integer. In the second case you might as well have a float, double, an object, the beginning of an array of some sort, std container etc.
The downside is that even though you have this neutral way of passing things around, you still have to retrieve the thing that the void pointer is pointing at at some point, which means that you need to cast it to the respective type. This is a very tricky part, and if you do it incorrectly you can 1) create segmentation faults (imagine your void pointer is pointing at an array of chars but you cast it back to a double: memory-wise you read "further" than the array, so the probability of accessing memory you are not supposed to is pretty big), 2) corrupt your data, etc.
Another downside of the void pointers is the absence of pointer arithmetic. You cannot do:
myvar_ptr++;
if myvar_ptr is a void pointer, since you don't know what the +1 indicates memory-wise. It might be a char, it might be a big fat object, or something else.
In C++ most people prefer using pointers combined with templates, since you still get a higher degree of flexibility compared to a primitive-type pointer, yet the information about the type is not lost. Mimicking void pointers for classes using templates combined with pointers is not that difficult, especially since we have things like inheritance. Void can also be used as a type for a template (see here).
EDIT:
Sorry about misreading your question's title. Since I don't want my writing to go to waste here is the old post.
Some extra information on pointers:
Pointers are useful for
talking to C applications (especially those using UI APIs where event handling of the various components is usually (always?) done using pointers to functions)
dynamic memory allocation including malloc (and similar oldies but goldies), new operator (C++; new actually returns a pointer if you didn't know) etc.
pointer arithmetic - performance and flexibility
embedded software development - pointers allow you a very precise access to the memory (which is also essential for the *nix systems, where C is the widely used standard language)
generally offer extra flexibility and reusability of components - unlike some other languages, C/C++ allows you to pass things by value and by reference, which means that you are not obligated to copy stuff (basically what pass by value does) if you don't want to. Since everything has a starting point in memory, you can even pass functions using pointers (as I've mentioned, a very popular way of doing UI components' callbacks)
Pointers can also be a pain in the butt:
many books/tutorials do a poor job in explaining in an easy-to-understand manner how you can screw things up (pretty badly on top of that) if you use pointers incorrectly
stacking pointer of a pointer of a pointer of a pointer ... can lead to extremely obfuscated (meaning hard to read and understand) code
where there is dynamic memory allocation involved there is always a chance you miss something, and there is your big fat memory leak. But then again you can assign something on the stack to a pointer. ;)
C++, as mentioned by some here, manages to hide a lot of the pointer stuff (maybe exactly in order to prevent developers from misusing pointers), but you still need it depending on what you are doing.

void pointers are pointers to data of unknown type.
For example, you can write a sorting function for sorting data. You do not need to know the data type (numeric, ASCII text, Chinese, a double precision number or something else), the algorithm is always the same. If you pass in a void pointer to the data and a pointer to a data comparison function you will be able to sort the data.
Another example of using a void pointer would be a data compression function. The compression function doesn't care about the data type. It only needs to know the start of the data and the size of the data.

is it possible to use function pointers this way?

This is something that recently crossed my mind, quoting from wikipedia: "To initialize a function pointer, you must give it the address of a function in your program."
So, I can't make it point to an arbitrary memory address, but what if I overwrite the memory at the address of the function with a piece of data the same size as before and then invoke it via the pointer? If such data corresponds to an actual function and the two functions have matching signatures, the latter should be invoked instead of the first.
Is it theoretically possible?
I apologize if this is impossible due to some very obvious reason that I should be aware of.

If you're writing something like a JIT, which generates native code on the fly, then yes you could do all of those things.
However, in order to generate native code you obviously need to know some implementation details of the system you're on, including how its function pointers work and what special measures need to be taken for executable code. For one example, on some systems after modifying memory containing code you need to flush the instruction cache before you can safely execute the new code. You can't do any of this portably using standard C or C++.
You might find when you come to overwrite the function, that you can only do it for functions that your program generated at runtime. Functions that are part of the running executable are liable to be marked write-protected by the OS.

The issue you may run into is Data Execution Prevention. It tries to keep you from executing data as code or allowing code to be written to like data. You can turn it off on Windows. Some compilers/OSes may also place code into const-like sections of memory that the OS/hardware protect. The standard says nothing about what should or should not work when you write an array of bytes to a memory location and then call a function that includes jumping to that location. It's all dependent on your hardware and your OS.

While the standard does not provide any guarantees as to what would happen if you make a function pointer that does not refer to a function, in real life, in your particular implementation and knowing the platform, you may be able to do that with raw data.
I have seen example programs that created a char array with the appropriate binary code and executed it by careful casting of pointers. So in practice, and in a non-portable way, you can achieve that behavior.

It is possible, with caveats given in other answers. You definitely do not want to overwrite memory at some existing function's address with custom code, though. Not only is executable memory typically not writable, but you have no guarantees as to how the compiler might have used that code. For all you know, the code may be shared by many functions that you think you're not modifying.
So, what you need to do is:
1. Allocate one or more memory pages from the system.
2. Write your custom machine code into them.
3. Mark the pages as non-writable and executable.
4. Run the code, and there's two ways of doing it:
   - Cast the address of the pages you got in #1 to a function pointer, and call the pointer.
   - Execute the code in another thread. You're passing the pointer to code directly to a system API or framework function that starts the thread.

Your question is confusingly worded.
You can reassign function pointers and you can assign them to null. Same with member pointers. Unless you declare them const, you can reassign them and yes the new function will be called instead. You can also assign them to null. The signatures must match exactly. Use std::function instead.
You cannot "overwrite the memory at the address of a function". You probably can indeed do it some way, but just do not. You're writing into your program code and are likely to screw it up badly.
