I wrote a small program:
#include <iostream>
using namespace std;
int raw(int &x) {
cout<<x<<endl;
cout<<&x<<endl;
}
int main () {
int s= 1;
int &z=s;
raw(s);
raw(z);
return 0;
}
The output is(as expected):
1
0x7fff5ed36894
1
0x7fff5ed36894
It works as I expect it to be but I am curious about how this is implemented internally. Is it function overloading or something else or one of the function is a wrapper around the other function to provide user-friendliness or the compiler does casting while on its own?
This is how it looks in assembler:
int s= 1;
002044A8 mov dword ptr [s],1
int &z=s;
002044AF lea eax,[s]
002044B2 mov dword ptr [z],eax
raw(s);
002044B5 lea eax,[s]
002044B8 push eax
002044B9 call raw (020110Eh)
002044BE add esp,4
raw(z);
002044C1 mov eax,dword ptr [z]
002044C4 push eax
002044C5 call raw (020110Eh)
LEA (in lea eax,[s]) means Load Effective Address so you can see how z effectively contains a pointer to location of s.
push instructions that prepare the arguments before the function call clearly show that you get (the same) pointer as an input in both cases.
This is non-optimized code.
When the compiler produces code for your program, when it sees the & saying that this is a reference, it really produces a pointer variable [or something, in machine code, that resembles a pointer].
So z as such will hold the address of s.
When you call raw(s), the compiler says "Ah, so the parameter to raw is a reference which means the address of s". When you do raw(z), the compiler says "Ah, we have a reference already, so lets just pass the content of z", which since you set it to s earlier, is the same address as s.
This is exactly as it should be.
Internally this
int s= 1;
int &z=s;
raw(s);
raw(z);
Is optimized to this:
int s = 1;
raw(s);
raw(s);
Because after you do int &z = s; variable z will be aliased to s to end end of its lifetime. So basically it will be the same as s.
Related
This question already has answers here:
Can a local variable's memory be accessed outside its scope?
(20 answers)
Closed 2 years ago.
When the object of std::function is destroyed?
Why the pointer to the object of std::function is still valid when the variable fn1 is out of the scope(you see the code snippet works well, http://cpp.sh/6nnd4)?
// function example
#include <iostream> // std::cout
#include <functional> // std::function, std::negate
// a function:
int half(int x) {return x/2;}
int main () {
std::function<int(int)> *pFun;
{
std::function<int(int)> fn1 = half; // function
pFun= &fn1;
std::cout << "fn1(60): " << (*pFun)(60) << '\n';
}
std::cout << "fn1(60): " << (*pFun)(90) << '\n';
return 0;
}
The short answer is that C++ doesn't validate the contents of a pointer. That's the developer's responsibility. It only validates that the variable pFun is in scope.
In C++ it is often the developers responsibility to make sure that their pointers are pointing to valid objects. In this case, fn1 has likely been created on the stack. Since many compilers implement local variables that way. When fn1goes out of scope the compiler will no longer allow the use of the variable fn1. However, what happens to the memory on the stack backing fn1 is not defined. In your case, the compiler left it untouched so (*pFun)(90) happens to still work. However, running the same code on another compiler may not. In fact, simply turning on compiler optimization may cause it to stop working. It all depends on whether the compiler re-uses that memory, or frees it from the stack.
In your example the scope is a code block. If the scope had instead been a separate function named MyFunction2then when MyFunction2exited the stack memory associated with fn1 would have been freed off the stack and the memory available for reuse. Both by interrupts and whatever code was called after MyFunction2. So the data would be more likely to have been changed such that (*pFun)(90) faulted.
Now debugging something like this crashing is fairly strait forward. Imagine if you had written to pFun instead, this would have caused memory corruption of what ever object happened to be using that memory after fn1 had gone out of scope.
Why the pointer to the object of std::function is still valid when the variable fn1 is out of the scope?
Let me present a simpler example, using ints. But if you are brave, you can try to read the assembler for the std::function version.
int main () {
int a = 0;
int *c = nullptr;
{
int b = 1;
c = &b;
}
a = *c;
return a;
}
This is generated with gcc 10.2 -O0, but the other compilers have a really similar output. I will comment it to aid the understanding.
main:
push rbp
mov rbp, rsp
mov DWORD PTR [rbp-4], 0 # space for `a`
mov QWORD PTR [rbp-16], 0 # space for `c`
mov DWORD PTR [rbp-20], 1 # space for `b`
lea rax, [rbp-20] # int *tmp = &b Not really, but enough for us
mov QWORD PTR [rbp-16], rax # c = tmp
mov rax, QWORD PTR [rbp-16]
mov eax, DWORD PTR [rax] # int tmp2 = *tmp
mov DWORD PTR [rbp-4], eax # a = tmp2
mov eax, DWORD PTR [rbp-4]
pop rbp
ret # return a
And the program return 1, as expected when you see the assembler. You can see, b was not "invalidated", because we did not roll the stack back, and we didnt change its value. This will be a case like the one you are in, were it works.
Now lets enable optimizations. Here is it compiled with -O1, the lowest level.
Here is it compiled with gcc:
main:
mov eax, 0
ret
And here is it compiled with clang:
main: # #main
mov eax, 1
ret
And here is with visual studio:
main PROC ; COMDAT
mov eax, 1
ret 0
main ENDP
You can see the results are diferent. It is undefined behaviour, each implementation will do as it sees fit.
Now this is happening with some local variables in the same function. But consider this cases:
It was a pointer, lets say allocated with new. You already called delete. What guaranties that that memory is not used by someone else now?
When the stack grows, the value will eventually be overiden. Consider this code:
int* foo() {
int a = 0;
return &a;
}
void bar() {
int b = 1;
}
int main () {
int *ptr = foo();
bar();
int a = *ptr;
return a;
}
It didnt return 1 or 0. It returned 139.
And here is a good read on this.
This question might looks subjective, but actually, I am looking for an objective reason this, like the technical and logical part.
So let's declare an alias for the variable x:
int x = 333;
int &xx = x;
Why not just use x and save the bother of creating the alias/reference?
In addition to what gsamaras said reference can be use when you want to modify something
std::vector<int> myvector{ 1,2,3,4,5 };
for (auto& elem : myvector) ++elem;
Let's not declare it. You just go with x in that case.
However, when you are dealing with large objects, such as huge classes, or even titanic vectors, you do not want to copy the whole thing, that's why you pass a reference to a function that needs that class/object/whatever.
Passing or returning an object as function argument by reference instead of by value
is probably the most important reason. For more, please read Why should I use reference variables at all?
Additionally to the function call and loop usage described in the other answers I also like to use references to avoid repeatedly dereferencing deeply nested structures.
So instead of writing
collection.container[i].subcontainer->item.func_a();
collection.container[i].subcontainer->item.func_b();
collection.container[i].subcontainer->item.data = d;
I sometimes use
auto & cur_item = collection.container[i].subcontainer->item;
cur_item.func_a();
cur_item.func_b();
cur_item.data = d;
You're looking for a good reason to use a reference instead of an object in block scope?
References bind to objects and allow an alternative means to identify them. The variable that holds 333 in your example, just has two names in essence. If you look at the generated assembly, you probably won't even see an allocation for anything besides the object. For example, this code:
#include <iostream>
#include <cstdlib>
#include <ctime>
int main()
{
std::srand(std::time(NULL));
int x = std::rand();
int &xx = x;
++xx;
std::cout << xx;
return 0;
}
Produces this assembly when optimizations are enabled:
main:
sub rsp, 8
mov edi, 0
call time
mov edi, eax
call srand
call rand
lea esi, [rax+1]
mov edi, OFFSET FLAT:std::cout
call std::basic_ostream<char, std::char_traits<char> >::operator<<(int)
mov eax, 0
add rsp, 8
ret
_GLOBAL__sub_I_main:
sub rsp, 8
mov edi, OFFSET FLAT:std::__ioinit
call std::ios_base::Init::Init()
mov edx, OFFSET FLAT:__dso_handle
mov esi, OFFSET FLAT:std::__ioinit
mov edi, OFFSET FLAT:std::ios_base::Init::~Init()
call __cxa_atexit
add rsp, 8
ret
If you examine this at the live example and see what code maps to where, you'd notice there is absolutely no representation of the reference remaining. All access is to the object.
So if a reference at block scope is just a fancy way to give another name to an object, it isn't all that useful. But what if the object didn't have a name to begin with?
For instance:
std::string foo() { return "Hello World!"; }
int main() {
foo();
}
When I call foo it returns a temporary object where the result is stored. That object is nameless. I can either use it directly or it's gone the nano-second that ; is reached after the call. Sure, I can copy it:
std::string res = foo();
But what if the string is very long, and I just want to print it a few times. Why do I have to waste my time with copies?
Turns out you don't have to copy:
std::string const &res = foo();
The above is not a dangling reference! A const reference can bind to a temporary object. And the C++ language promises that object will live for as long as the reference does in that block scope. Essentially, we can name the object and make it usable after the call, thus saving a copy.
I have problem with inline asm in C++. I'm trying to implement fast strlen, but it is not working - when I use __declspec(naked) keyword debugger shows address of input as 0x000000, when I don't use that keyword, eax is pointing for some trash, and function returns various values.
Here's code:
int fastStrlen(char *input) // I know that function does not calculate strlen
{ // properly, but I just want to know why it crashes
_asm // access violation when I try to write to variable x
{
mov ecx, dword ptr input
xor eax, eax
start:
mov bx, [ecx]
cmp bl, '\0'
je Sxend
inc eax
cmp bh, '\0'
je Sxend
inc eax
add ecx, 2
jmp start
Sxend:
ret
}
}
int _tmain(int argc, _TCHAR* argv[])
{
char* test = "test";
int x = fastStrlen(test);
cout << x;
return 0;
}
can anybody point me out what am I doing wrong?
Don't use __declspec(naked) since in that case the complier doesn't generate epilogue and prologue instructions and you need to generate a prologue just like compiler expects you to if you want to access the argument fastStrlen. Since you don't know what the compiler expects you should just let it generate the prologue.
This means you can't just use ret to return to the caller because this means you're supplying your own epilogue. Since you don't know what prologue the compiler used, you don't know what epilogue you need implement to reverse it. Instead assign the return value to a C variable you declare inside the function before the inline assembly statement and return that variable in a normal C return statement. For example:
int fastStrlen(char *input)
{
int retval;
_asm
{
mov ecx, dword ptr input
...
Sxend:
mov retval,eax
}
return retval;
}
As noted in your comments your code will not be able to improve on the strlen implementation in your compiler's runtime library. It also reads past the end of strings of even lengths, which will cause a memory fault if the byte past the end of a string isn't mapped into memory.
Hi I know this is a very silly/basic question, but what is the difference between the code like:-
int *i;
for(j=0;j<10;j++)
{
i = static_cast<int *>(getNthCon(j));
i->xyz
}
and, some thing like this :-
for(j=0;j<10;j++)
{
int *i = static_cast<int *>(getNthCon(j));
i->xyz;
}
I mean, are these code extremely same in logic, or would there be any difference due to its local nature ?
One practical difference is the scope of i. In the first case, i continues to exist after the final iteration of the loop. In the second it does not.
There may be some case where you want to know the value of i after all of the computation is complete. In that case, use the second pattern.
A less practical difference is the nature of the = token in each case. In the first example i = ... indicates assignment. In the second example, int *i = ... indicates initialization. Some types (but not int* nor fp_ContainerObject*) might treat assignment and initialization differently.
There is very little difference between them.
In the first code sample, i is declared outside the loop, so you're re-using the same pointer variable on each iteration. In the second, i is local to the body of the loop.
Since i is not used outside the loop, and the value assigned to it in one iteration is not used in future iterations, it's better style to declare it locally, as in the second sample.
Incidentally, i is a bad name for a pointer variable; it's usually used for int variables, particularly ones used in for loops.
For any sane optimizing compiler there will be no difference in terms of memory allocation. The only difference will be the scope of i. Here is a sample program (and yes, I realize there is a leak here):
#include <iostream>
int *get_some_data(int value) {
return new int(value);
}
int main(int argc, char *argv[]){
int *p;
for(int i = 0; i < 10; ++i) {
p = get_some_data(i);
std::cout << *p;
}
return 0;
}
And the generated assembly output:
int main(int argc, char *argv[]){
01091000 push esi
01091001 push edi
int *p;
for(int i = 0; i < 10; ++i) {
01091002 mov edi,dword ptr [__imp_operator new (10920A8h)]
01091008 xor esi,esi
0109100A lea ebx,[ebx]
p = get_some_data(i);
01091010 push 4
01091012 call edi
01091014 add esp,4
01091017 test eax,eax
01091019 je main+1Fh (109101Fh)
0109101B mov dword ptr [eax],esi
0109101D jmp main+21h (1091021h)
0109101F xor eax,eax
std::cout << *p;
01091021 mov eax,dword ptr [eax]
01091023 mov ecx,dword ptr [__imp_std::cout (1092048h)]
01091029 push eax
0109102A call dword ptr [__imp_std::basic_ostream<char,std::char_traits<char> >::operator<< (1092044h)]
01091030 inc esi
01091031 cmp esi,0Ah
01091034 jl main+10h (1091010h)
}
Now with the pointer declared inside of the loop:
int main(int argc, char *argv[]){
008D1000 push esi
008D1001 push edi
for(int i = 0; i < 10; ++i) {
008D1002 mov edi,dword ptr [__imp_operator new (8D20A8h)]
008D1008 xor esi,esi
008D100A lea ebx,[ebx]
int *p = get_some_data(i);
008D1010 push 4
008D1012 call edi
008D1014 add esp,4
008D1017 test eax,eax
008D1019 je main+1Fh (8D101Fh)
008D101B mov dword ptr [eax],esi
008D101D jmp main+21h (8D1021h)
008D101F xor eax,eax
std::cout << *p;
008D1021 mov eax,dword ptr [eax]
008D1023 mov ecx,dword ptr [__imp_std::cout (8D2048h)]
008D1029 push eax
008D102A call dword ptr [__imp_std::basic_ostream<char,std::char_traits<char> >::operator<< (8D2044h)]
008D1030 inc esi
008D1031 cmp esi,0Ah
008D1034 jl main+10h (8D1010h)
}
As you can see, the output is identical. Note that, even in a debug build, the assembly remains identical.
Ed S. shows that most compilers will generate the same code for both cases. But, as Mahesh points out, they're not actually identical (even beyond the obvious fact that it would be legal to use i outside the loop scope in version 1 but not version 2). Let me try to explain how these can both be true, in a way that isn't misleading.
First, where does the storage for i come from?
The standard is silent on this—as long as storage is available for the entire lifetime of the scope of i, it can be anywhere the compiler likes. But the typical way to deal with local variables (technically, variables with automatic storage duration) is to expand the stack frame of the appropriate scope by sizeof(i) bytes, and store i as an offset into that stack frame.
A "teaching compiler" might always create a stack frame for each scope. But a real compiler usually won't bother, especially if nothing happens on entering and exiting the loop scope. (There's no way you can tell the difference, except by looking at the assembly or breaking in with a debugger, so of course it's allowed to do this.) So, both versions will probably end up with i referring to the exact same offset from the function's stack frame. (Actually, it's quite plausible i will end up in a register, but that doesn't change anything important here.)
Now let's look at the lifecycle.
In the first case, the compiler has to default-initialize i where it's declared at the function scope, copy-assign into it each time through the loop, and destroy it at the end of the function scope. In the second case, the compiler has to copy-initialize i at the start of each loop, and destroy it at the end of each loop. Like this:
If i were of class type, this would be a very significant difference. (See below if it's not obvious why.) But it's not, it's a pointer. This means default initialization and destruction are both no-ops, and copy-initialization and copy-assignment are identical.
So, the lifecycle-management code will be identical in both cases: it's a copy once each time through the loop, and that's it.
In other words, the storage is allowed to be, and probably will be, the same; the lifecycle management is required to be the same.
I promised to come back to why these would be different if i were of class type.
Compare this pseudocode:
i.IType();
for(j=0;j<10;j++) {
i.operator=(static_cast<IType>(getNthCon(j));
}
i.~IType();
to this:
for(j=0;j<10;j++) {
i.IType(static_cast<IType>(getNthCon(j));
i.~IType();
}
At first glance, the first version looks "better", because it's 1 IType(), 10 operator=(IType&), and 1 ~IType(), while the second is 10 IType(IType&) and 10 ~IType(). And for some classes, this might be true. But if you think about how operator= works, it usually has to do at least the equivalent of a copy construction and a destruction.
So the real difference here is that the first version requires a default constructor and a copy-assignment operator, while the second doesn't. And if you take out that static_cast bit (so we're talking about a conversion constructor and assignment instead of copy), what you're looking at is equivalent to this:
for(j=0;j<10;j++) {
std::ifstream i(filenames[j]);
}
Clearly, you would try to pull i out of the loop in that case.
But again, this is only true for "most" classes; you can easily design a class for which version 2 is ridiculously bad and version 1 makes more sense.
For every iteration, in second case, a new pointer variable is created on the stack. While in the first case, the pointer variable is created only once(i.e., before entering the loop )
I'm new to assembly coding and embedding it in C++, the thing I'm trying to do is add the integers in an array using assembly. This is the code i have so far:
#include <iostream>
#include <stdio.h>
int x [] = {5,4,3,2,1};
int sumArray(int [5]);
int main()
{
sumArray(x);
printf_s("The sum of the array is %d");
}
int sumArray(int [5])
{
__asm
{
mov edi,OFFSET sumArray
mov ecx,5
mov eax,0
L1:
add eax,[edi]
add edi, TYPE sumArray
loop L1
}
}
An original problem I was having with was with mov ecx I had it as
mov ecx,LENGTHOF sumArray
but it wouldn't compile so I changed it to 5 and it compiled. So now when I run the program it breaks. I used F11 in Visual Studio to go line by line to see at what line the program breaks, and program breaks when its going through the loop a second time.
So if anyone can help me figure out how I can go about fixing it I would appreciate it.
It seems to me you have it broken quite a bit. First of all, you have a function named sumArray with an unnamed argument. But inside the function, you are referring to sumArray as if it were the name of the array argument. Then, you need to understand the way C(++) passes arrays as arguments: they are (always) passed by reference, as a pointer to the first array member. And it also means the function (in general) does not know the length of the array (unless you set it to a fixed-size). So, you usually pass the length in another argument. Which means we have the following:
int sumArray(int arr[], int len)
{
__asm
{
mov edi, arr
mov ecx, len
xor eax, eax
L1:
add eax, [edi]
add edi, 4
loop L1
}
}
Note that we are not trying to get an offset of the array, that would get us to the pointer, we need to get the value of the pointer, i.e. the address of the first array item. Also, note I have hardcoded the element size (4), there is no point acting like we can work with anything, if at the previous line, we add 32-bit words. (The xor eax, eax is just another way to set a register to zero, to be honest, in today’s CPUs, I am not sure if it is faster or not.)
And when testing this, do not forget to actually pass the result to the printf_s…
The problem with your code seems to be you are using your sumArray function name instead of your actual array x, and thats why it crashes.
Isn't your asm supposed to look like these:
__asm
{
mov edi,OFFSET x
mov ecx,LENGTHOF x
mov eax,0
L1:
add eax,[edi]
add edi, TYPE x
loop L1
}
? (here I assume, that you are not mistaken about macro usage, as I never compiled anything using MASM, that seems to be used here, but I think you got the idea)
The another question is why you pass unnamed argument to sumArray if you don't actually use it, you'd better then pass the array as a named argument and it's length and make use of them in your assembly code.