Does a variable's name take up memory space (in C++ or any other programming language)?

e.g.
int a=3;//-----------------------(1)
and
int a_long_variable_name_used_instead_of_small_one=3;//-------------(2)
Out of (1) and (2), which will occupy more memory space, or will both occupy the same amount?

In C++ and most statically compiled languages, variable names may take up more space during the compilation process but by run time the names will have been discarded and thus take up no space at all.
In interpreted languages and compiled languages which provide run time introspection/reflection the name may take up more space.
Also, language implementation will affect how much space variable names take up. The implementer may have decided to use a fixed-length buffer for each variable name, in which case each name takes up the same space regardless of length. Or they may have dynamically allocated space based on the length.

Both occupy the same amount of memory. Variable names are just to help you, the programmer, remember what the variable is for, and to help the compiler associate different uses of the same variable. With the exception of debugging symbols, they make no appearance in the compiled code.

The name you give to a variable in C/C++ will not affect the size of the resulting executable code. When you declare a variable like that, the compiler reserves memory space (in the case of an int on x86/x64, four bytes) to store the value. To access or alter the value it will then use the address rather than the variable name (which is lost in the compilation process).

In most interpreted languages, the name would be stored in a table somewhere in memory, thus taking up different amounts of space.
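To make that concrete, here is a toy sketch (my own illustration, not from the answers above): an interpreter that keeps variables in a name-to-value map pays for each name's characters at run time.
#include <iostream>
#include <map>
#include <string>

int main() {
    // Toy "interpreter" environment: each entry stores the variable's name
    // as a std::string, so a longer name really does cost more bytes here.
    std::map<std::string, int> env;
    env["a"] = 3;
    env["a_long_variable_name_used_instead_of_small_one"] = 3;

    for (const auto& [name, value] : env)
        std::cout << name << " needs at least " << name.size()
                  << " bytes for its name; value = " << value << '\n';
}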

If my understanding is correct, they'll take up the same amount of memory.
I believe (and am ready to get shot down in flames) that in C++ the names are symbolic to help the user and the compiler will just create a block of memory sufficient to hold the type you're declaring, in this case an int.
So, they should both occupy the same memory size, i.e., the memory required to hold an int.

For C++,
$ cat name.cpp
int main() {
    int a = 74678;
    int bcdefghijklmnopqrstuvwxyz = 5664;
}
$ g++ -S name.cpp
$ cat name.s
        .file   "name.cpp"
        .text
        .align 2
        .globl main
        .type   main, #function
main:
.LFB2:
        pushl   %ebp
.LCFI0:
        movl    %esp, %ebp
.LCFI1:
        subl    $8, %esp
.LCFI2:
        andl    $-16, %esp
        movl    $0, %eax
        addl    $15, %eax
        addl    $15, %eax
        shrl    $4, %eax
        sall    $4, %eax
        subl    %eax, %esp
        movl    $74678, -4(%ebp)
        movl    $5664, -8(%ebp)
        movl    $0, %eax
        leave
        ret
.LFE2:
        .size   main, .-main
        .section        .note.GNU-stack,"",#progbits
        .ident  "GCC: (GNU) 3.4.6 20060404 (Red Hat 3.4.6-11.0.1)"
$
As you can see, neither a nor bcdefghijklmnopqrstuvwxyz appears in the assembler output. So the length of the variable name does not matter at run time in terms of memory.
But, variable names are huge contributors to the program design. Some programmers even rely on good naming conventions instead of comments to explain the design of their program.
A relevant quote from Hacker News,
Code should be written so as to completely describe the program's functionality to human readers, and only incidentally to be interpreted by computers. We have a hard time remembering short names for a long time, and we have a hard time looking at long names over and over again in a row. Additionally, short names carry a higher likelihood of collisions (since the search space is smaller), but are easier to "hold onto" for short periods of reading.
Thus, our conventions for naming things should take into consideration the limitations of the human brain. The length of a variable's name should be proportional to the distance between its definition and its use, and inversely proportional to its frequency of use.

In modern compilers the name of a variable does not impact the amount of space that is required to hold it in C++.

Field names (instance variable names) in Java use memory, but only once per field; this is needed for reflection to work. The same goes for other languages based on the JVM, and I would guess for .NET.

Compilers are there for a reason...
They optimize code to use as little space as possible and to run as fast as possible, especially modern ones.

No. Both will occupy equal space.

Related

Changing a number defined in a C++(C) program without compiling the source again

Suppose I have this simple program which prints a number:
#include <iostream>
int unique_id = 112233;
int main()
{
std::cout << unique_id;
return 0;
}
Then I compile it to something like a.exe. Now I want to create another application that opens a.exe and changes unique_id to something else. Is it possible?
I'm not going to pass a parameter to the program because of some restrictions.
I want to use the unique_id, as its name implies, to uniquely identify where my program is running. But I don't want to compile my program 1000 times for 1000 customers. I know I can use the hard disk serial number, but in virtual machines that serial number may be omitted. I know I can use the CPU serial number, but I have read in Stack Overflow posts that it is deprecated. I know I can use the MAC address too :), but that address can be changed easily. So I decided to put the unique ID in the exe file itself.
Considering the motivation you added to the question, you could simply make the exe read the id from a .txt file, and ship a different .txt file with the exe for every customer.
Or, equivalently, you could make a DLL (or the equivalent for your platform) that has a function returning the id, and only recompile the DLL for every customer.
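A minimal sketch of the first suggestion (the file name and format here are my own assumptions, not part of the question):
#include <fstream>
#include <iostream>

int main()
{
    // Read the per-customer id from a file shipped alongside the executable.
    std::ifstream in("unique_id.txt");
    int unique_id = 0;
    if (!(in >> unique_id)) {
        std::cerr << "missing or malformed unique_id.txt\n";
        return 1;
    }
    std::cout << unique_id;
    return 0;
}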
In general, you cannot change anything without re-compiling.
In practice and in very limited cases, you might patch your binary. This is mostly processor specific (and executable-format specific and ABI specific) and depends less on your particular operating system version (e.g. a patch that works on one version of Windows could well work on the next).
(However, I don't know and never used Windows; I'm only using Linux; you should adapt my answer to your operating system)
So in some cases you might reverse-engineer your binary executable. If you do have the C source code, you could ask your compiler to emit the assembler code (e.g. by compiling with gcc -O -fverbose-asm -S with GCC). Then you might disassemble your executable, and change, with a binary or hexadecimal editor, the machine code containing that constant.
This won't always work, because the machine instruction (and its size) could depend on the magnitude (bit size) of your constant.
To take a simple example, in C, for GCC 7, on Linux/x86-64, consider the following C file:
/// A, B, C are preprocessor symbols defined as integers
int f(int x) {
    if (x > 0)
        return A*x + B;
    return C;
}
If I compile that with gcc -fverbose-asm -S -O -DA=12751 -DB=32 -DC=11 e.c I'm getting:
        .type   f, #function
f:
.LFB0:
        .cfi_startproc
# e.c:3: if (x > 0)
        testl   %edi, %edi      # x
        jle     .L3             #,
# e.c:4: return A * x + B;
        imull   $12751, %edi, %edi      #, x, tmp90
        leal    32(%rdi), %eax  #, <retval>
        ret
.L3:
# e.c:5: return C;
        movl    $11, %eax       #, <retval>
# e.c:6: }
        ret
        .cfi_endproc
.LFE0:
        .size   f, .-f
But if I do gcc -S -O -fverbose-asm -DA=12753 -DB=32 -DC=10 e.c I'm getting
        .type   f, #function
f:
.LFB0:
        .cfi_startproc
# e.c:3: if (x > 0)
        testl   %edi, %edi      # x
        jle     .L3             #,
# e.c:4: return A * x + B;
        imull   $12753, %edi, %edi      #, x, tmp90
        leal    32(%rdi), %eax  #, <retval>
        ret
.L3:
# e.c:5: return C;
        movl    $10, %eax       #, <retval>
# e.c:6: }
        ret
So indeed, in the above case I could patch the binary (I would need to find the 12751 and 11 constants in machine code; it is doable but tedious in that case).
Now, let's try with A being a small power of two, like 16, and C being 0, so
gcc -S -O -fverbose-asm -DA=16 -DB=32 -DC=0 e.c:
f:
.LFB0:
        .cfi_startproc
# e.c:4: return A * x + B;
        leal    2(%rdi), %eax   #, tmp90
        sall    $4, %eax        #, tmp93
        testl   %edi, %edi      # x
        movl    $0, %edx        #, tmp92
        cmovle  %edx, %eax      # tmp93,, tmp92, <retval>
# e.c:6: }
        ret
Because of compiler optimizations, the code changed significantly. It is not easy to patch.
Important notice
With enough effort, money and time (think of NSA-like abilities) a lot of things are possible.
If your goal is to obfuscate some data in your binary (e.g. some password), you might encrypt it to make hackers' lives harder (but don't be naive, the NSA will be able to get it). Remember the motto: there is No Silver Bullet. It looks like that is your goal, but don't be too naive (BTW, the legal protections around your software, e.g. the license, matter even more, so you need a lawyer to write a good EULA).
If your goal is on the contrary to adapt some performance-critical code, you could use metaprogramming and partial-evaluation techniques. A practice I like is to generate some temporary C (or C++) code at runtime (better suited to your particular situation and data), compile that temporary code as a plugin, then dynamically load that temporary plugin (using dlopen and dlsym on Linux; on Windows you'll need LoadLibrary, but I leave you to work out the details and consequences). Instead of generating C or C++ code at runtime you could use some JIT-compiling library like libgccjit. If you are fond of such techniques, consider instead using better programming languages (like Common Lisp with SBCL), if your management allows them.
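A rough sketch of that plugin approach on Linux (the temporary paths, the compiler invocation via std::system, and the generated function are all my own illustrative assumptions; link with -ldl):
#include <cstdio>
#include <cstdlib>
#include <dlfcn.h>

int main() {
    // 1. Generate specialized C code at runtime.
    std::FILE* f = std::fopen("/tmp/gen.c", "w");
    if (!f) return 1;
    std::fprintf(f, "int compute(int x) { return x * 12751 + 32; }\n");
    std::fclose(f);

    // 2. Compile it as a shared object (the "temporary plugin").
    if (std::system("gcc -O2 -fPIC -shared /tmp/gen.c -o /tmp/gen.so") != 0)
        return 1;

    // 3. Load the plugin and call the generated function.
    void* h = dlopen("/tmp/gen.so", RTLD_NOW);
    if (!h) return 1;
    auto compute = reinterpret_cast<int (*)(int)>(dlsym(h, "compute"));
    if (compute) std::printf("%d\n", compute(3));
    dlclose(h);
    return 0;
}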
But I don't want to compile my program 1000 times for 1000 customers
That surprises me a lot. Compiling a simple (short) C file containing just constants is quick, and linking time is also quick. I would instead consider recompilation for each customer.
BTW, I feel you are incredibly naive. The most important protection is not technical in your binary, it is a legal protection (and you need a good contract, so find and pay a good lawyer).
Did you consider, on the contrary, making your product free software? Many companies are doing that (and making money on something other than licenses, e.g. support).
NB: there are lots of existing license managers. Did you consider buying and using one? Notice also that corporations have large incentives to avoid cheating, and those willing to steal your software will be able to do so anyway. You'll sell more products by working on software quality, not by spending effort on vain "protection" measures which annoy your customers, increase your logistics, distribution and maintenance costs, and make debugging customer-found bugs harder.
No, the behaviour of changing a variable that is const is undefined. So you can't do this with standard C or C++.
Your best bet is to resort to an inline assembly solution; but note that UNIQUE_ID might be compiled out altogether (neither C nor C++ are reflective languages). In order to increase the probability of UNIQUE_ID being retained, remove the const qualifier and possibly introduce volatile.
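As one hedged illustration of that (the marker layout is my own invention, not something the answer prescribes), you can embed the id inside a recognizable byte pattern so a patching tool has something to search for:
#include <iostream>

// volatile discourages the compiler from folding the value into the code,
// and the surrounding magic text gives a patch tool a pattern to find.
volatile char unique_id_blob[] = "-->UNIQUE_ID:0000112233<--";

int main() {
    // Parse the ten digits back out of the blob at run time.
    int id = 0;
    for (int i = 13; i <= 22; ++i)
        id = id * 10 + (unique_id_blob[i] - '0');
    std::cout << id << '\n';
}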
Personally I'd pass UNIQUE_ID on the command line to your program.
Starting point: https://msdn.microsoft.com/en-us/library/fabdxz08.aspx

VM interpreter - weighting performance benefits and drawbacks of larger instruction set / dispatch loop

I am developing a simple VM and I am at a crossroads.
My initial goal was to use byte-long instructions, and therefore a small loop and quick computed-goto dispatch.
However, it turns out reality could not be further from that - 256 opcodes are nowhere near enough to cover signed and unsigned 8-, 16-, 32- and 64-bit integers, floats and doubles, pointer operations, and the different combinations of addressing. One option was to not implement bytes and shorts, but the goal is to make a VM that supports the full C subset as well as vector operations, since they are pretty much everywhere anyway, albeit in different implementations.
So I switched to 16-bit instructions, which also lets me add portable SIMD intrinsics and more compiled common routines that really save on performance by not being interpreted. There is also caching of global addresses: they are initially compiled as base-pointer offsets, and the first time an address is resolved, the instruction simply overwrites the offset and opcode so that next time it is a direct jump, at the cost of an extra instruction in the set for each use of a global by an instruction.
Since I am not yet at the profiling stage, I am in a dilemma: are the extra instructions worth the added flexibility? Will the presence of more instructions, and therefore the absence of copy-back-and-forth instructions, make up for the increased dispatch loop size? Keep in mind the instructions are just a few assembly instructions each, e.g.:
        .globl  __Z20assign_i8u_reg8_imm8v
        .def    __Z20assign_i8u_reg8_imm8v;  .scl 2;  .type 32;  .endef
__Z20assign_i8u_reg8_imm8v:
LFB13:
        .cfi_startproc
        movl    _ip, %eax
        movb    3(%eax), %cl
        movzbl  2(%eax), %eax
        movl    _sp, %edx
        movb    %cl, (%edx,%eax)
        addl    $4, _ip
        ret
        .cfi_endproc
LFE13:
        .p2align 2,,3
        .globl  __Z18assign_i8u_reg_regv
        .def    __Z18assign_i8u_reg_regv;  .scl 2;  .type 32;  .endef
__Z18assign_i8u_reg_regv:
LFB14:
        .cfi_startproc
        movl    _ip, %edx
        movl    _sp, %eax
        movzbl  3(%edx), %ecx
        movb    (%ecx,%eax), %cl
        movzbl  2(%edx), %edx
        movb    %cl, (%eax,%edx)
        addl    $4, _ip
        ret
        .cfi_endproc
LFE14:
        .p2align 2,,3
        .globl  __Z24assign_i8u_reg_globCachev
        .def    __Z24assign_i8u_reg_globCachev;  .scl 2;  .type 32;  .endef
__Z24assign_i8u_reg_globCachev:
LFB15:
        .cfi_startproc
        movl    _ip, %eax
        movl    _sp, %edx
        movl    4(%eax), %ecx
        addl    %edx, %ecx
        movl    %ecx, 4(%eax)
        movb    (%ecx), %cl
        movzwl  2(%eax), %eax
        movb    %cl, (%eax,%edx)
        addl    $8, _ip
        ret
        .cfi_endproc
LFE15:
        .p2align 2,,3
        .globl  __Z19assign_i8u_reg_globv
        .def    __Z19assign_i8u_reg_globv;  .scl 2;  .type 32;  .endef
__Z19assign_i8u_reg_globv:
LFB16:
        .cfi_startproc
        movl    _ip, %eax
        movl    4(%eax), %edx
        movb    (%edx), %cl
        movzwl  2(%eax), %eax
        movl    _sp, %edx
        movb    %cl, (%edx,%eax)
        addl    $8, _ip
        ret
        .cfi_endproc
This example contains the instructions to:
assign unsigned byte from immediate value to register
assign unsigned byte from register to register
assign unsigned byte from global offset to register, then cache the address and change to the direct instruction
assign unsigned byte from global offset to register (the now cached previous version)
... and so on...
Naturally, when I produce a compiler for it, I will be able to test the instruction flow in production code and optimize the arrangement of the instructions in memory to pack together the frequently used ones and get more cache hits.
I just have a hard time figuring out whether such a strategy is a good idea - the bloat buys flexibility, but what about performance? Will more compiled routines make up for a larger dispatch loop? Is it worth caching global addresses?
I would also like someone decent at assembly to express an opinion on the quality of the code generated by GCC - are there any obvious inefficiencies and room for optimization? To make the situation clear: there is an sp pointer, which points to the stack that implements the registers (there is no other stack), ip is logically the current instruction pointer, and gp is the global pointer (not referenced, accessed as an offset).
EDIT: Also, this is the basic format I am implementing the instructions in:
INSTRUCTION assign_i8u_reg16_glob() { // assign unsigned byte to reg from global offset
    FETCH(globallAddressCache);
    REG(quint8, i.d16_1) = GLOB(quint8);
    INC(globallAddressCache);
}
FETCH returns a reference to the instruction struct selected by the opcode.
REG returns a reference to a register value of type T at the given offset.
GLOB returns a reference to a global value at a cached global offset (effectively an absolute address).
INC just increments the instruction pointer by the size of the instruction.
Some people will probably advise against the use of macros, but with templates it is much less readable. This way the code is pretty obvious.
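For reference, a stripped-down sketch of the computed-goto dispatch mentioned above (the opcode set and layout are purely illustrative, and labels-as-values is a GCC/Clang extension):
#include <cstdint>
#include <cstdio>

int main() {
    // Bytecode: load immediate 2, load immediate 3, add, halt.
    enum { OP_IMM, OP_ADD, OP_HALT };
    const uint8_t code[] = { OP_IMM, 2, OP_IMM, 3, OP_ADD, OP_HALT };

    // Direct-threaded dispatch table using labels-as-values.
    static void* table[] = { &&op_imm, &&op_add, &&op_halt };
    const uint8_t* ip = code;
    int stack[16], *sp = stack;

    goto *table[*ip];
op_imm:
    *sp++ = ip[1];          // the operand byte follows the opcode
    ip += 2;
    goto *table[*ip];
op_add:
    sp[-2] += sp[-1]; --sp;
    ip += 1;
    goto *table[*ip];
op_halt:
    std::printf("%d\n", sp[-1]);   // prints 5
    return 0;
}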
EDIT: I would like to add a few points to the question:
I could go for a "register operations only" solution which can only move data between registers and "memory" - be that global or heap. In this case, every "global" and heap access will have to copy the value, modify or use it, and move it back to update. This way I have a shorter dispatch loop, but a few extra instructions for each instruction that addresses non-register data. So the dilemma is a few times more native code with longer direct jumps, or a few times more interpreted instructions with shorter dispatch loop. Will a short dispatch loop give me enough performance to make up for the extra and costly memory operations? Maybe the delta between the shorter and longer dispatch loop is not enough to make a real difference? In terms of cache hits, in terms of the cost of assembly jumps.
I could go for additional decoding and only 8-bit-wide instructions; however, this may add another jump - jump to wherever this instruction is handled, then waste time on either jumping to the case where the particular addressing scheme is handled, or on decoding operations and a more complex execution method. And in the first case, the dispatch loop still grows, plus adding yet another jump. The second option - register operations - can be used to decode the addressing, but a more complex instruction with more compile-time unknowns will be needed in order to address anything. I am not really sure how this will stack up against a shorter dispatch loop; once again, I am uncertain how my "shorter and longer dispatch loop" relates to what is considered short or long in terms of assembly instructions, the memory they need and the speed of their execution.
I could go for the "many instructions" solution - the dispatch loop is a few times larger, but it still uses pre-computed direct jumping. Complex addressing is specific and optimized for each instruction and compiled to native code, so the extra memory operations that would be needed by the "register only" approach are compiled in and mostly executed on registers, which is good for performance. Generally, the idea is to add more to the instruction set but also add to the amount of work that can be compiled in advance and done in a single "instruction". The longer instruction set also means a longer dispatch loop, longer jumps (although that can be optimized to a minimum) and fewer cache hits, but the question is BY HOW MUCH? Considering every "instruction" is just a few assembly instructions, is an assembly snippet of about 7-8k instructions considered normal, or too much? Considering the average instruction size is around 2-3 bytes, this should not be more than 20 KB of memory, enough to fit completely in most L1 caches. But this is not concrete math, just stuff I came across googling around, so maybe my "calculations" are off? Or maybe it doesn't work that way? I am not that experienced with caching mechanisms.
To me, as I currently weigh the arguments, the "many instructions" approach appears to have the biggest chance of best performance, provided of course my theory about fitting the "extended dispatch loop" in the L1 cache holds. So here is where your expertise and experience come into play. Now that the context is narrowed and a few supporting ideas presented, maybe it will be easier to give a more concrete answer on whether the benefits of a larger instruction set prevail over the size increase of native code by decreasing the amount of slower, interpreted code.
My instruction size data is based on those stats.
You might want to consider separating the VM ISA and its implementation.
For instance, in a VM I wrote I had a "load value direct" instruction. The next value in the instruction stream wasn't decoded as an instruction, but loaded as a value into a register. You can consider this one macro instruction or two separate values.
Another instruction I implemented was a "load constant value", which loaded a constant from memory (using a base address for the table of constants and an offset). A common pattern in the instruction stream was therefore load value direct (index); load constant value. Your VM implementation may recognize this pattern and handle the pair with a single optimized implementation.
Obviously, if you have enough bits, you can use some of them to identify a register. With 8 bits it may be necessary to have a single register for all operations. But again, you could add another instruction with register X which modifies the next operation. In your C++ code, that instruction would merely set the currentRegister pointer which the other instructions use.
Will more compiled routines make up for a larger dispatch loop?
I take it you didn't fancy having single byte instructions with a second byte of extra opcode for certain instructions? I think a decode for 16-bit opcodes may be less efficient than 8-bit + extra byte(s), assuming the extra byte(s) aren't too common or too difficult to decode in themselves.
If it was me, I'd work on getting the compiler (not necessarily a full-fledged compiler with "everything", but a basic model) going with a fairly limited set of "instructions". Keep the code generation part fairly flexible so that it'll be easy to alter the actual encoding later. Once you have that working, you can experiment with various encodings and see what the result is in performance, and other aspects.
A lot of your minor question points are very hard to answer for anyone that hasn't done both of the choices. I have never written a VM in this sense, but I have worked on several disassemblers, instruction set simulators and such things. I have also implemented a couple of languages of different kinds, in terms of interpreted languages.
You probably also want to consider a JIT approach, where instead of loading bytecode, you interpret the bytecode and produce direct machine code for the architecture in question.
The GCC code doesn't look terrible, but there are several places where code depends on the value of the immediately preceding instruction - which is not great in modern processors. Unfortunately, I don't see any solution to that - it's a "too short code to shuffle things around" problem - adding more instructions obviously won't work.
I do see one little problem: loading a 32-bit constant will require that it's 32-bit aligned for best performance. I have no idea how (or if) Java VMs deal with that.
I think you are asking the wrong question, and not because it is a bad question, on the contrary, it is an interesting subject and I suspect many people are interested in the results just as I am.
However, so far no one is sharing similar experience, so I guess you may have to do some pioneering. Instead of wondering which approach to use and wasting time on the implementation of boilerplate code, focus on creating a "reflection" component that describes the structure and properties of your language. Create a nice polymorphic structure with virtual methods, without worrying about performance, and modular components you can assemble at runtime; there is even the option to use a declarative language once you have established the object hierarchy. Since you appear to use Qt, you have half the work cut out for you. Then you can use the tree structure to analyze and generate a variety of different code - C code to compile, or bytecode for a specific VM implementation, of which you can create several; you can even use it to programmatically generate the C code for your VM instead of typing it all by hand.
I think this advice will be more beneficial if you do end up pioneering on the subject without a concrete answer in advance; it will allow you to easily test all the scenarios and make up your mind based on actual performance rather than on personal assumptions and those of others. Then maybe you can share the results and answer your own question with performance data.
Instruction length in bytes has been handled the same way for quite a while. Obviously, being limited to 256 instructions isn't a good thing when there are so many types of operations you wish to perform.
This is why there's a prefix value. Back in the Game Boy architecture, there wasn't enough room to include the needed 256 bit-manipulation instructions, so one opcode was used as a prefix instruction. This kept the original 256 opcodes as well as 256 more that start with that prefix byte.
For example:
One operation might look like this: D6 FF = SUB A, 0xFF
But a prefixed instruction would be presented as: CB D6 = SET 2, (HL)
If the processor read CB it'd immediately start looking in another instruction set of 256 opcodes.
The same goes for the x86 architecture today, where any instruction prefixed with 0F is essentially part of another instruction set.
With the sort of execution you're using for your emulator, this is the best way of extending your instruction set. 16-bit opcodes would take up way more space than necessary, while a prefix byte doesn't add much of a lookup cost.
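A minimal sketch of what that prefix dispatch can look like in an interpreter (handler names and table contents are illustrative only):
#include <cstdint>
#include <cstdio>

using Handler = void (*)(const uint8_t*& pc);

// Illustrative handlers; real tables would populate all 256 entries.
void base_nop(const uint8_t*&)   { std::puts("base opcode"); }
void ext_setbit(const uint8_t*&) { std::puts("extended opcode"); }

Handler base_table[256]     = { base_nop };
Handler extended_table[256] = { ext_setbit };

void step(const uint8_t*& pc) {
    uint8_t op = *pc++;
    if (op == 0xCB)                  // prefix byte: escape to the second table
        extended_table[*pc++](pc);
    else
        base_table[op](pc);
}

int main() {
    const uint8_t code[] = { 0x00, 0xCB, 0x00 };
    const uint8_t* pc = code;
    step(pc);   // dispatches through base_table
    step(pc);   // sees the 0xCB prefix, dispatches through extended_table
}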
One thing you should decide is what balance you wish to strike between code-file size efficiency, cache efficiency, and raw-execution-speed efficiency. Depending upon the coding patterns for the code you're interpreting, it may be helpful to have each instruction, regardless of its length in the code file, get translated into a structure containing a pointer and an integer. The first pointer would point to a function that takes a pointer to the instruction-info structure as well as to the execution context. The main execution loop would thus be something like:
do
{
    pc = pc->func(pc, &context);
} while(pc);
The function associated with an "add short immediate" instruction would be something like:
INSTRUCTION *add_short_imm(INSTRUCTION *pc, EXECUTION_CONTEXT *context)
{
    context->op_stack[0].asInt64 += pc->operand;
    return pc + 1;
}
while "add long immediate" would be:
INSTRUCTION *add_long_imm(INSTRUCTION *pc, EXECUTION_CONTEXT *context)
{
    context->op_stack[0].asInt64 +=
        (uint32_t)pc->operand + ((int64_t)(pc[1].operand) << 32);
    return pc + 2;
}
and the function associated with an "add local" instruction would be:
INSTRUCTION *add_local(INSTRUCTION *pc, EXECUTION_CONTEXT *context)
{
    CONTEXT_ITEM *op_stack = context->op_stack;
    op_stack[0].asInt64 += op_stack[pc->operand].asInt64;
    return pc + 1;
}
Your "executables" would consist of compressed bytecode format, but they would then get translated into a table of instructions, eliminating a level of indirection when decoding the instructions at run-time.

Where is the one-to-one correlation between the assembly and the C++ code?

I tried to examine how this code looks in assembly:
int main(){
    if (0){
        int x = 2;
        x++;
    }
    return 0;
}
I was wondering what if (0) means.
I used the shell command g++ -S helloWorld.cpp in Linux
and got this code:
.file "helloWorld.cpp"
.text
.globl main
.type main, #function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1"
.section .note.GNU-stack,"",#progbits
I expected that the assembly will contain some JZ but where is it?
How can I compile the code without optimization?
There is no direct, guaranteed relationship between C++ source code and the generated assembler. The C++ source code defines a certain semantics, and the compiler outputs machine code which will implement the observable behavior of those semantics. How the compiler does this, and the actual code it outputs, can vary enormously, even over the same underlying hardware; I would be very disappointed in a compiler which generated code which compared 0 with 0, and then did a conditional jump if the results were equal, regardless of what the C++ source code was.
In your example, the only observable behavior in your code is to return 0 to the OS. Anything the compiler generates must do this (and have no other observable behavior). The code you show isn't optimal for this:
        xorl    %eax, %eax
        ret
is really all that is needed. But of course, the compiler is free to generate a lot more if it wants. (Your code, for example, sets up a frame to support local variables, even though there aren't any. Many compilers do this systematically, because most debuggers expect it, and get confused if there is no frame.)
With regards to optimization, this depends on the compiler. With g++, -O0 (that's the letter O followed by the number zero) turns off all optimization. This is the default, however, so it is effectively what you are seeing. In addition to having several different levels of optimization, g++ supports turning individual optimizations off or on. You might want to look at the complete list: http://gcc.gnu.org/onlinedocs/gcc-4.6.2/gcc/Optimize-Options.html#Optimize-Options.
The compiler eliminates that code as dead code, e.g. code that will never run. What you're left with is establishing the stack frame and setting the return value of the function. if(0) is never true, after all. If you want to get JZ, then you should probably do something like if(variable == 0). Keep in mind that the compiler is in no way required to actually emit the JZ instruction, it may use any other means to achieve the same thing. Compiling a high level language to assembly is very rarely a clear, one-to-one correlation.
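As an illustration (my own sketch, not from the original answers): when the condition depends on a runtime value the compiler cannot fold, an unoptimized build typically keeps the compare-and-branch:
// g++ -O0 -S jz.cpp: the output for this function typically contains a
// cmpl/je pair (a JZ-style conditional jump), because the compiler cannot
// fold a runtime value the way it folds if (0).
int f(int variable) {
    if (variable == 0)
        return 1;
    return 0;
}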
The code has probably been optimized.
if (0){
    int x = 2;
    x++;
}
has been eliminated.
movl $0, %eax is where the return value is set. It seems the other instructions are just program init and exit.
There is a possibility that the compiler optimized it away, since it's never true.
The optimizer removed the if conditional and all of the code inside, so it doesn't show up at all.
The if (0) {} block has been optimized out by the compiler, as it will never be executed.
So your function only returns 0 (movl $0, %eax).

How does a C++ compiler compile variable names? [closed]

I understand I did not make myself clear. My doubt, I think, could be summed up in this:
In an executable file (machine code), how are "variables" represented? Are they static memory addresses? Does the compiler give each one a specific "name" (or just keep the ones you gave them)?
Expressed in code:
int x=5;
//Bunch of code
cin>>y;
cout<<x+1;
How does the program, on each and every machine, know which address is going to hold the value 5, hold the input value, add 1 to the value it now holds, and finally print that same value?
--João
It's implementation-specific.
Typically, the location of variables will be based on all sorts of factors and optimizations. They may not live in RAM at all, as they may be optimised to live entirely within registers, or optimised away entirely.
Variable names don't exist at run-time; they're discarded during compilation. However, the compiler may emit debug information that's stored in the application binary, to allow the developers to debug the application. This is usually removed in release versions, though.
I have no idea about the specifics of Gameshark. But in many cases, the location of a particular variable can be figured out by taking a look at the machine code for the application.
Here is a simple program in C:
int main() {
    int a = 5;
    int b = 7;
    int c = a + b;
    return 0;
}
If you compile it with gcc -m32 -S -O0 -o main.s main.c under Linux, you'll get something like this
.file "main.c"
.text
.globl main
.type main, #function
main:
.LFB0:
/* %ebp is a Base Pointer Register */
pushl %ebp
movl %esp, %ebp
/* Here we reserve space for our variables */
subl $16, %esp
/* a's address is %ebp - 4 */
movl $5, -4(%ebp)
/* b's address is %ebp - 8 */
movl $7, -8(%ebp)
/* a + b */
movl -8(%ebp), %eax
movl -4(%ebp), %edx
addl %edx, %eax
/* c's address is %ebp - 12 */
movl %eax, -12(%ebp)
/* return 0 */
movl $0, %eax
leave
ret
As you can see, in this case, variables' addresses are calculated as offsets of a base pointer of a function. If you enable optimisations, variables' values may be stored in registers.
So there are two parts to this, and I'll do my best.
When compiling, a compiler will convert the C++ code into an internal representation. This is then converted into machine code that uses the CPU's registers as efficiently as possible, pushing the rest of the data into RAM. As the program executes, data from RAM gets copied back and forth into registers.
On your other question, one method I've seen people use is for something like the amount of gold a player has. A program takes the entire memory space of the game and copies it. Then the user does something (a minimal action) to gain or lose gold. The external application then searches through the entire memory space for which values have changed from the original amount of gold to the current amount. Once it finds that location, it can edit the memory there and update it with whatever value it wants.
Generally, the more complicated the game is, the harder that method is.
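A toy sketch of that snapshot-diff idea (it assumes you can already copy the target's memory into two buffers, which is the platform-specific part this answer glosses over):
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Return the offsets at which two memory snapshots differ; re-running this
// after each known change narrows the candidates down to the real location.
std::vector<std::size_t> diff_offsets(const std::vector<uint8_t>& before,
                                      const std::vector<uint8_t>& after) {
    std::vector<std::size_t> hits;
    const std::size_t n = std::min(before.size(), after.size());
    for (std::size_t i = 0; i < n; ++i)
        if (before[i] != after[i])
            hits.push_back(i);
    return hits;
}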

Get the "size" (length) of a C++ function? [duplicate]

Possible Duplicate:
How to get the length of a function in bytes?
I'm making a Hooking program that will be used to insert a method into the specified section of memory.
I need to get the length of a local C++ function, I've used a cast to get the location of a function, but how would I get the length?
would
int GetFuncLen()
{
    int i = 0;
    while((DWORD*)Function+i<max)
    {
        if((DWORD*)Function+i==0xC3)
        {
            return i;
        }
        i++;
    }
}
work?
Your code seems to be operating system, compiler, and machine architecture specific.
(I know nothing about Windows)
It could be wrong if max is not defined.
It is operating system specific (probably Windows only) because DWORD is not a standard C++ type. You could use intptr_t (from <cstdint> header).
Your code is compiler specific, because you assume that every compiled function has a well-defined unique end, and doesn't share any code with other functions. (Some compilers are able to do such optimizations, e.g. make two functions share a common epilogue or code chunk, using jump instructions.)
Your code is machine specific, because you assume that the last instruction would be a RET coded 0xC3 and this is specific to x86 & x86-64 (won't work on Alpha or ARM, on which Windows is rumored to have been or to be ported). Also, that byte could appear inside other instructions or inlined constants (as Mat commented).
I am not sure that the notion of where a binary function ends has a well defined meaning. But if it does, I would expect that the linker may know about it. On some systems, for example on Linux with ELF executable, the compiler and the linker produces the size of each function.
Perhaps you better need to find the symbol near to a given address. I don't know if Windows has such a functionality (on Linux, the dladdr GNU function from <dlfcn.h> could be useful). Perhaps your operating system provides an equivalent?
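A small sketch of the dladdr route on Linux (note that with glibc you may need _GNU_SOURCE for the declaration, and to link with -ldl):
#define _GNU_SOURCE 1   // for dladdr with glibc
#include <dlfcn.h>
#include <cstdio>

// Print which symbol contains a given address, if the dynamic linker knows.
void describe(void* addr) {
    Dl_info info;
    if (dladdr(addr, &info) && info.dli_sname)
        std::printf("%p lies in %s (which starts at %p)\n",
                    addr, info.dli_sname, info.dli_saddr);
    else
        std::printf("no symbol found for %p\n", addr);
}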
No. For a few reasons.
1) 0xC3 is only a 'ret' instruction if it is at the point where a new instruction is expected. There could easily be other instructions that include a 0xc3 byte within their operands, and you'd only get part of the code.
2) There can be multiple 'ret' instructions in any given function, depending on the compiler and its settings. Again, you'd only get part of the function.
3) Functions often use constructs like "switch" statements, that use "jump tables" that are located AFTER the ret instruction. Again, you'd only get part of the function.
And what you're trying to do is not likely to work anyway.
The biggest problem is that various assembly instructions will often reference specific areas of memory by using offsets rather than absolute addresses. So while extremely minimal functions might work, any functions that call out into other functions will likely fail.
Assuming you're trying to load these functions into an external process, and you're trying to do this on Windows, a better solution is to use DLL injection to load a DLL into your target process.
If you really need to inject the memory, then you'll need an assembly language parser for your particular platform to update all of the memory addresses for the relevant instructions, which is a very complex task. Or you could write your functions in assembly language and make sure that you're not using relative offsets for anything other than referencing parts of your own code, which is a bit easier, but more limiting in what you can do.
You could force your function to be put in a section all by itself (see eg http://msdn.microsoft.com/en-us/library/s20kdbse(v=VS.71).aspx).
I think that if you define a section, declare a variable in it, define your function in it, then define another variable in it then the addresses of the two variables will cover your function.
Better is to put the two variables and the function in separate sections and then use section merging to control the order they appear in the resulting code (see How to refer to the start-of a user-defined segment in a Visual Studio-project?)
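A rough, untested sketch of that bracketing idea with MSVC (the section names and the '$' ordering trick are my assumptions about how the linker sorts sections sharing a prefix; treat it as a starting point only):
#include <cstddef>

// MSVC merges sections whose names match up to the '$' and orders the pieces
// alphabetically by the suffix, so the two markers should bracket the code.
#pragma section("hook$a", read, execute)
#pragma section("hook$c", read, execute)

__declspec(allocate("hook$a")) static const unsigned char hook_begin = 0;

#pragma code_seg("hook$b")
static int hooked_add(int a, int b) { return a + b; }
#pragma code_seg()   // later functions go back to the default .text

__declspec(allocate("hook$c")) static const unsigned char hook_end = 0;

static std::size_t hooked_add_size() {
    return static_cast<std::size_t>(&hook_end - &hook_begin);
}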
As others have pointed out you probably can't do anything useful with this, and it's not at all portable.
The only reliable way to do this is to compile your code with a dummy number for the length of the function (but not run it), disassemble it, and calculate the length by hand, then take that number and substitute it for the dummy number, and recompile your program.
When I needed to do this, I just made a guess as to how big the function should be. As long as your guess is not too small (and not way, way too big) you should have no problems.
You can use objdump to get the size of objects with external linkage. Otherwise, you could take the assembly output of the compiler (e.g., gcc -S) and examine it manually; you'll have the opportunity to see what names the length fields get:
.file "test.cpp"
.text
.globl main
.type main, #function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 4.6.0-3~ppa1) 4.6.1 20110409 (prerelease)"
.section .note.GNU-stack,"",#progbits
See the .size main, .-main evaluation: it calculates the function's size.
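For the objdump route mentioned earlier, the linked symbol table already carries that size on ELF systems; for example:
$ g++ test.cpp -o test
$ objdump -t test | grep main
In GNU objdump's symbol-table output, the hexadecimal field just before the symbol name is the symbol's size in bytes, which for a function symbol is its length.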