Using Visual Studio, I am running into some trouble with an int type variable and a float type variable. They are both stored in their own arrays. When I go to print them out they come out as memory location gibberish. When I debug I noticed that the correct value is displayed next to the memory gibberish in the watch area. I also noticed that under type, the variable types have an * (asterisk) next to them. Could anybody offer information as to why this would happen? Thanks in advance.
Watch area looks like this...
Name Value Type
score 0x002ff5c8 {96.0000000} float *
studentID 0x002ff698 {9317} int *
I recommend reading the tutorial above, and perhaps an introductory book, depending on how interested you are in learning C++. The types shown in the watch window are pointers to int and float data. Here is a small example that answers the question (how to print out these values):
float* a = new float(5.8);
This establishes the pointer; it points to a memory location where a float with the value 5.8 is stored.
std::cout << *a;
The asterisk before a dereferences the pointer; this is how the data is accessed. You may want to check that you have a valid pointer first, or your program can crash.
delete a;
Delete the allocated memory when it will no longer be used (see EDIT 2). This frees the space that was allocated for the float that a points to; failing to do so causes a memory leak.
EDIT 1:
Consider that the pointer may point to a contiguous array of floats or int (which is much more likely than just one), then you will have to know the size of the array you are reading to access the elements. In this case, you will use the operator [] to access the members, let's say we have an array b,
float* b = new float[2] {0.0,1.0};
To print its members, you have to access each element:
std::cout << b[0] << ' ' << b[1];
The delete operator looks like this for arrays:
delete[] b;
EDIT 2:
Whenever you use new to dynamically allocate memory, think about the scope of the variable; when that scope ends, delete the pointer. User is correct: you do not want to delete a pointer that may be used later, nor is it necessary to call delete on pointers obtained from references.
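To make the difference between the address and the value concrete, here is a minimal sketch. The helper name describe is hypothetical, and note that a null check only catches null pointers, not dangling ones.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats the value a pointer refers to, or a
// placeholder if the pointer is null, instead of printing the address.
std::string describe(const int* p) {
    if (p == nullptr) {
        return "(null)";
    }
    std::ostringstream out;
    out << *p;  // dereference: streams the int, not the address
    return out.str();
}
```

Streaming p itself instead of *p would produce the hexadecimal address, which is the "gibberish" from the question.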
First off, you are using the debugger. This is awesome. The sheer number of SO questions that could be solved in five minutes with a debugger is staggering. You are already far, far ahead of a lot of the time-wasting sad sacks who can't be bothered to use the expletive deleted tools that came with the compiler.
Second, some important reading because it explains part of what is going on: What is array decaying?
Now to break down what the debugger is showing you
score 0x002ff5c8 {96.0000000} float *
score: Obviously the variable's name
float *: score is a variable of type float *, a pointer to a float.
0x002ff5c8: This is the data value of score. Pointers are a reference to a location in memory. Rather than being data, they point to data. So a pointer is a variable that contains where to find, the address of, another variable. 002ff5c8 is the hexadecimal location in memory where you will find what score points to.
{96.0000000}: score points to a floating point value that has been set to 96 (possibly plus or minus some fuzziness because not all numbers can be exactly represented with floating point)
So the crazy number 0x002ff5c8 tells the program where to find score's data, and this data happens to be 96.
Note the debugger only shows you the first value in the array of data that is at score, which brings us back to array decaying. Odds are good that the program has knowledge of how much data is pointed at by score. Could be one float. Could be a million. You have to carry the length of a block of an array around with it once the array has decayed.
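A minimal sketch of what carrying the length around looks like in practice. The function names sum_scores and score_count_demo and the sample values are assumptions for illustration, not taken from the asker's program.

```cpp
#include <cstddef>

// Hypothetical helper: receives a decayed pointer plus the element count.
// Inside this function, sizeof(scores) would be the size of a pointer,
// not the size of the original array, so the length must be passed in.
float sum_scores(const float* scores, std::size_t count) {
    float total = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        total += scores[i];  // same as *(scores + i)
    }
    return total;
}

// In the scope where the array is declared, its length is still known.
std::size_t score_count_demo() {
    float score[3] = {96.0f, 88.5f, 72.0f};
    return sizeof(score) / sizeof(score[0]);  // number of elements
}
```

The sizeof trick only works where the array type is still visible; after the array has decayed to float*, the count has to travel with the pointer.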
I have a this fragment of code in C++:
char x[50];
cout << x << endl;
which outputs some random symbols as seen here:
So my first question: what is the reason behind this output? Shouldn't it be spaces or at least same symbols?
The reason I am concerned with this is that I am writing program in CUDA and I'm doing some character manipulations inside __global__ function, hence the use of string gives a "calling host function is not allowed" error.
But if I am using "big enough" char array (each chunk of text I am operating with differs in size, meaning that it will not always utilize char array fully) it's sometimes not fully filled and I left with junk like in the picture below hanging at the end of text:
So my second question: is there any way to avoid this?
what is the reason behind this output?
The values in an automatic variable are indeterminate. The standard doesn't specify it, so it might be spaces as you said, it might be random content.
[...] sometimes not fully filled and I left with junk [...]
Strings in C are null-terminated, so any routine dedicated to printing a string will loop as long as no null byte is encountered. In uninitialized memory, this null byte occurs randomly (or not at all). These weird, trailing characters are a result of that.
is there any way to avoid this?
Yes. Initialize it.
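A minimal host-side sketch of what initializing buys you. The function names are hypothetical, and std::strncpy is used only for illustration; it may not be usable inside a __global__ function, where a plain loop would do the same job.

```cpp
#include <cstddef>
#include <cstring>

// Value-initializing the array sets every element to '\0', so the buffer
// is a valid empty C string before anything is written to it.
std::size_t empty_buffer_length() {
    char x[50] = {};
    return std::strlen(x);  // the first byte is already the terminator
}

// Copying a shorter text leaves the remaining bytes as '\0', so no
// trailing junk follows the text. The name fill_buffer is hypothetical.
std::size_t fill_buffer(const char* text) {
    char x[50] = {};
    std::strncpy(x, text, sizeof(x) - 1);  // keep the last byte as terminator
    return std::strlen(x);
}
```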
(will assume x86 in this post)
what is the reason behind this output?
Here's roughly what happens, in assembly, when you do char x[50];:
SUB ESP, 0x34 ; 52 bytes
Essentially, the stack pointer is moved down by 0x34 bytes, because the x86 stack grows toward lower addresses (50 is rounded up to 52 so that the size is divisible by 4). Then that space on the stack becomes x. There's no cleaning, no changes or pushes or pops, just this space becoming x. Anything that was there before (abandoned params, return addresses, variables from previous function calls) will be in x.
Here's roughly what happens when you do new char[50]:
1. Control gets passed to the allocator
2. The allocator looks for any heap of sufficient size (read as: an already allocated but uncommitted heap)
3. If 2 fails, the allocator makes a new heap
4. The allocator takes the heap (either the found or allocated one) and commits it
5. The address of that heap is returned to your code where it is used as a char*
The same as with a stack, you get whatever data is there. Some programs or systems may have allocators that zero out heaps when they are allocated or committed, where others may only zero when allocated but not committed, and some may not zero at all. Depending on the allocator, you may get clean memory or you may get re-used and dirty memory. This is why the values here can be non-zero and aren't predictable.
is there any way to avoid this?
In the case of heap memory, you can overload the new and delete operators in C++ and always zero newly allocated memory. You can see examples of overloading these operators here. As for memory on the stack, you just have to live with zeroing it out every time.
ZeroMemory(myArray, sizeof(myArray));
Alternatively, for both methods, you could stay away from naked arrays and use std::vector or other wrappers that take care of initialization for you. You'll still want to make sure to initialize integers and other numeric or pointer data-types, though.
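As a rough sketch of the std::vector route: constructing a vector with only a size value-initializes its elements, so a vector of char starts out as all zero bytes. The function name below is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// std::vector<char>(n) value-initializes its elements, so every byte is 0.
bool vector_starts_zeroed(std::size_t n) {
    std::vector<char> buffer(n);
    for (char c : buffer) {
        if (c != '\0') {
            return false;
        }
    }
    return true;
}
```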
No, there is no way to avoid it. C++ does not initialize automatic variables of built-in types (such as arrays of built-in types in your case) automatically, you need to initialize them yourself.
Why are you having issues with this code?
char x[50];
cout << new char[50] << endl;
cout << x << endl;
You're leaking memory with the new char[50], which has no corresponding delete[].
Also, the contents of uninitialized memory are indeterminate, as others have said, and in most cases you get garbage within that memory block. A better method is to initialize it:
char x[50] = {};
char* y = new char[50]();
Then just remember to call delete[] on y later to free the memory. Yes, the OS will reclaim it when the process exits, but relying on that is never a way to write good programs.
If the pointer's data type is the same as that of the newly written data, I guess it wouldn't give an error, but if the pointer has a different data type, we'll have a type mismatch. I was wondering whether the compiler would do something about it (say, delete the dangling pointer first), or simply give an error.
@YuHao is absolutely right.
If you delete first, you might get a segmentation fault, if that unmaps a page that previously existed in your process' address space.
In any other case, you just write data somewhere; there might be useful stuff there by the time you do that, there might not. At any rate, you must avoid this.
I was wondering whether the compiler would do something about it (say, delete the dangling pointer first)
Could do (won't)
or simply give an error.
Could do (won't)
It's a dangling pointer. There's no protection against that.
It's undefined behaviour. This is fundamentally unanswerable.
Perhaps a quick overview so we are on the same page will help.
Let's assume a modern PC with a multi-tasking OS like Linux.
When a C++ program is run, a process is created that has a private memory space. This is a linear mapping of addresses that get translated by the CPU and OS to real addresses in RAM.
C++ is a strongly typed, low-level language with manual memory management. Strongly typed means the compiler does some basic checks to make sure logical statements in your program make sense. Pointers are just another type.
For instance:
float f = 10.0f; // this is ok, 10.0f is a float literal
float* pF = &f; // types match, & operator returns type float*
int i = f; // compiles: implicit float-to-int conversion, though the compiler may warn
int* pI = pF; // error: types do not match, int* is not float*
float f2 = pF; // error: types do not match, float is not float*
and so on.
That's compile time. That's really it. Once the program is running, the C++ runtime is pretty dumb. It doesn't do too many checks on memory operations since those can slow a program down and C++'s philosophy is "if you didn't ask for it, you don't pay for it".
At its most fundamental, memory is just a sequence of bytes. Data types like float and int are multi-byte data types (typically 4 bytes each on common platforms). That means a float in memory is stored in 4 adjacent byte-sized slots.
Finally, we are ready to answer your question. If you allocate memory at runtime through something like new you are handed back a pointer to memory you can use. Say we do this for a single float. new knows how to mark that memory as "in use". new won't give a pointer to those 4 bytes to anyone else, so you are safe. When you invoke delete that gives the memory back to the heap - some other part of your program is free to allocate it later. But the pointer you have is unmodified. We can still use the pointer to write to memory, only now we are headed for trouble.
Example:
float *pF = new float; // allocate 4-bytes on a 32-bit system
*pF = 10.0f; // fine
delete pF; // free the memory
*pF = 20.0f; // ?????
That last instruction says "write 20.0f to the memory pointed to by pF". We don't "own" that memory any longer. We say this pointer is dangling as it doesn't point to valid memory we can write to safely. But it does point to writable memory. You are correct that this is a source of bugs.
There are C++ memory allocators that will write special values into memory to indicate whether it is uninitialized or previously deleted. This will depend on your OS and toolset.
Another option to find bugs like this is to use the awesome tool Valgrind which will simulate your program memory and flag these kinds of bugs.
I am having this piece of code:
try
{
int* myTestArray = new int[2];
myTestArray[4] = 54;
cout << "Should throw ex " << myTestArray[4] + 1 << endl;
}
catch (exception& exception)
{
cout << "Exception content: " << exception.what() << endl;
}
What is really curious to me is why no exception is thrown here, since the code accesses an index that was not allocated, and why 55 is printed. Did C++ automatically increase the size of the array?
Accessing unallocated memory is not guaranteed to throw exceptions.
It's actually not guaranteed to do anything, since that's undefined behavior. Anything could happen. Beware of nasal demons.
It prints 55 because you just stored 54, fetched it back and then printed 54+1. It's not at all guaranteed to print 55, although that's often what will happen in practice. This time it worked.
There is an unstated, and incorrect, assumption here. That assumption is that C++ actually gives a damn about what you do with memory. C++, like its C ancestor, has a completely unchecked model of memory. What you have here is classically called a buffer overflow, and is a source of innumerable bugs including some horrible security flaws.
Here's what your code really says:
myTestArray is the name of a location in memory big enough to hold the address of an int.
Two ints' worth of memory have been allocated on the heap for it (along with probably 16 bytes of overhead, but we don't care about that now). [And that address is put into the location myTestArray. Doesn't matter, but that probably makes it clearer.]
You are then sticking the value 54 into the memory location 4 ints from the address contained in myTestArray.
You are then looking at that location, adding 1 and printing the result.
You are demonstrating that C(++) indeed just doesn't care.
Now, under most conditions the underlying memory management and run time system won't let you get away with it; you will violate its assumptions and get a segmentation error or something similar. But in this case, you are not hitting a boundary yet, most likely because you're piddling on the data structure that malloc is using under the covers to manage the heap. You're getting away with it because nothing is happening with the heap for the rest of the program. But for a real good time, write a little loop that does this code, freeing myTestArray and reallocating it. I'd lay long odds it won't run for more than 10 iterations before the program blows up, and might not make two.
Knowing what's going on here for sure is very hard to do. But I can give you a rough idea.
Most operating systems have a minimum size for memory allocations. In Unix it is the native page size. On x86 and amd64 systems this is 4 kB. In Windows it is 64 kB (I think).
The memory allocator used by malloc and new gets memory from the operating system in chunks of this size. It sets up data structures (often a linked list, sometimes a bitmap, or a tree) and hands out small pieces of the requested sizes.
One other confusing thing is that before your program even starts running main() it has run quite a bit of other code and allocated memory. For std::cout and other static and global objects, and for shared library linking.
But assume that when you call new your program first gets a chunk of 4 kB and gives you a pointer to 8 bytes of it (two integers). Your program has the entire 4 kB allocated and you can write there without crashing. However, what happens if you call new again? It is very likely that the memory allocator wrote some important tracking information somewhere into that 4 kB. The next bytes might be the size of the following block. Writing 54 into it might make it think it has more or less memory than it does. Or those bytes might be a pointer to the next block of free memory, and your 54 will cause the next memory allocation to crash the program.
You can write out of array range, but it is not guaranteed to work, and the data is not guaranteed to be persistent there, as something else can overwrite it.
It's simply not a good idea, and since there's no exception, it is a potentially hard-to-find bug.
When reading that memory, you will be pulling some random garbage that was there left over from some other program or whatever used the memory before, so it can really be anything.
As others have said, this is undefined behavior, but I thought a bit more info might help. myTestArray is not an "Array" in the sense of a type, with special operators, etc. It is just a pointer to a location in memory. The expression myTestArray[4] is just short-hand for *(myTestArray+4) - it is returning a reference to the memory location that is 4 * sizeof(int) past myTestArray. If you want bounds checking, you'll have to use std::vector<int>::at().
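For comparison, a minimal sketch of the bounds-checked alternative using std::vector<int>::at(), which throws std::out_of_range for a bad index. The function name is hypothetical; the index 4 and the size 2 mirror the question's code.

```cpp
#include <stdexcept>
#include <vector>

// Returns true if writing index 4 of a two-element vector throws.
bool at_reports_out_of_range() {
    std::vector<int> values(2);
    try {
        values.at(4) = 54;  // at() checks the index and throws
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

Note that values[4] on the same vector would still be unchecked, just like the raw pointer; only at() performs the test.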
Accessing array out of range is undefined behavior. Thus 55 is one of many possible results and there is nothing surprising here.
C++ Standard n3337 § 5.7 Additive operators
5) When an expression that has integral type is added to or subtracted
from a pointer, the result has the type of the pointer operand. If the
pointer operand points to an element of an array object, and the array
is large enough, the result points to an element offset from the
original element such that the difference of the subscripts of the
resulting and original array elements equals the integral expression.
In other words, if the expression P points to the i-th element of an
array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N
(where N has the value n) point to, respectively, the i + n-th and i −
n-th elements of the array object, provided they exist. Moreover, if
the expression P points to the last element of an array object, the
expression (P)+1 points one past the last element of the array object,
and if the expression Q points one past the last element of an array
object, the expression (Q)-1 points to the last element of the array
object. If both the pointer operand and the result point to elements
of the same array object, or one past the last element of the array
object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined.
My background is C++ and I'm currently about to start developing in C# so am doing some research. However, in the process I came across something that raised a question about C++.
This C# for C++ developers guide says that
In C++ an array is merely a pointer.
But this StackOverflow question has a highly-upvoted comment that says
Arrays are not pointers. Stop telling people that.
The cplusplus.com page on pointers says that arrays and pointers are related (and mentions implicit conversion, so they're obviously not the same).
The concept of arrays is related to that of pointers. In fact, arrays work very much like pointers to their first elements, and, actually, an array can always be implicitly converted to the pointer of the proper type.
I'm getting the impression that the Microsoft page wanted to simplify things in order to summarise the differences between C++ and C#, and in the process wrote something that was simpler but not 100% accurate.
But what have arrays got to do with pointers in the first place? Why is the relationship close enough for them to be summarised as the "same" even if they're not?
The cplusplus.com page says that arrays "work like" pointers to their first element. What does that mean, if they're not actually pointers to their first element?
There is a lot of bad writing out there. For example the statement:
In C++ an array is merely a pointer.
is simply false. How can such bad writing come about? We can only speculate, but one possible theory is that the author learned C++ by trial and error using a compiler, and formed a faulty mental model of C++ based on the results of his experiments. This is possibly because the syntax used by C++ for arrays is unconventional.
The next question is, how can a learner know if he/she is reading good material or bad material? Other than by reading my posts of course ;-) , participating in communities like Stack Overflow helps to bring you into contact with a lot of different presentations and descriptions, and then after a while you have enough information and experience to make your own decisions about which writing is good and which is bad.
Moving back to the array/pointer topic: my advice would be to first build up a correct mental model of how object storage works when we are working in C++. It's probably too much to write about just for this post, but here is how I would build up to it from scratch:
C and C++ are designed in terms of an abstract memory model, however in most cases this translates directly to the memory model provided by your system's OS or an even lower layer
The memory is divided up into basic units called bytes (usually 8 bits)
Memory can be allocated as storage for an object; e.g. when you write int x; it is decided that a particular block of adjacent bytes is set aside to store an integer value. An object is any region of allocated storage. (Yes this is a slightly circular definition!)
Each byte of allocated storage has an address, which is a token (usually representable as a simple number) that can be used to find that byte in memory. The addresses of any bytes within an object must be sequential.
The name x only exists during the compilation stage of a program. At runtime there can be int objects allocated that never had a name; and there can be other int objects with one or more names during compilation.
All of this applies to objects of any other type, not just int
An array is an object which consists of many adjacent sub-objects of the same type
A pointer is an object which serves as a token identifying where another object can be found.
From hereon in, C++ syntax comes into it. C++'s type system uses strong typing which means that each object has a type. The type system extends to pointers. In almost all situations, the storage used to store a pointer only saves the address of the first byte of the object being pointed to; and the type system is used at compilation time to keep track of what is being pointed to. This is why we have different types of pointer (e.g. int *, float *) despite the fact that the storage may consist of the same sort of address in both cases.
Finally: the so-called "array-pointer equivalence" is not an equivalence of storage, if you understood my last two bullet points. It's an equivalence of syntax for looking up members of an array.
Since we know that a pointer can be used to find another object; and an array is a series of many adjacent objects; then we can work with the array by working with a pointer to that array's first element. The equivalence is that the same processing can be used for both of the following:
Find Nth element of an array
Find Nth object in memory after the one we're looking at
and furthermore, those concepts can be both expressed using the same syntax.
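A small sketch of that shared syntax, with hypothetical function names: both lookups below operate on a pointer to the first element and are defined by the language to mean the same thing.

```cpp
#include <cstddef>

// Finds the Nth element using array-style indexing on a pointer.
int nth_by_index(const int* first, std::size_t n) {
    return first[n];
}

// Finds the Nth object in memory after the one the pointer refers to.
int nth_by_arithmetic(const int* first, std::size_t n) {
    return *(first + n);
}
```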
They are most definitely not the same thing at all, but in this case, confusion can be forgiven because the language semantics are ... flexible and intended for the maximum confusion.
Let's start by simply defining a pointer and an array.
A pointer (to a type T) points to a memory space which holds at least one T (assuming non-null).
An array is a memory space that holds multiple Ts.
A pointer points to memory, and an array is memory, so you can point inside or to an array. Since you can do this, pointers offer many array-like operations. Essentially, you can index any pointer on the presumption that it actually points to memory for more than one T.
Therefore, there's some semantic overlap between (pointer to) "Memory space for some Ts" and "Points to a memory space for some Ts". This is true in any language- including C#. The main difference is that they don't allow you to simply assume that your T reference actually refers to a space where more than one T lives, whereas C++ will allow you to do that.
Since all pointers to a T can be pointers to an array of T of arbitrary size, you can treat pointers to an array and pointers to a T interchangeably. The special case of a pointer to the first element is that the "some Ts" for the pointer and "some Ts" for the array are equal. That is, a pointer to the first element yields a pointer to N Ts (for an array of size N) and a pointer to the array yields ... a pointer to N Ts, where N is equal.
Normally, this is just interesting memory crapping-around that nobody sane would try to do. But the language actively encourages it by converting the array to the pointer to the first element at every opportunity, and in some cases where you ask for an array, it actually gives you a pointer instead. This is most confusing when you want to actually use the array like a value, for example, to assign to it or pass it around by value, when the language insists that you treat it as a pointer value.
Ultimately, all you really need to know about C++ (and C) native arrays is, don't use them, pointers to arrays have some symmetries with pointers to values at the most fundamental "memory as an array of bytes" kind of level, and the language exposes this in the most confusing, unintuitive and inconsistent way imaginable. So unless you're hot on learning implementation details nobody should have to know, then use std::array, which behaves in a totally consistent, very sane way and just like every other type in C++. C# gets this right by simply not exposing this symmetry to you (because nobody needs to use it, give or take).
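A minimal sketch of what "behaves like every other type" means for std::array: it can be passed, returned and assigned by value. The function name doubled is an assumption for illustration.

```cpp
#include <array>
#include <cstddef>

// std::array can be passed and returned by value like any other object.
std::array<int, 3> doubled(std::array<int, 3> values) {
    for (int& v : values) {
        v *= 2;
    }
    return values;  // the caller's array is untouched
}
```

None of this is possible with a native int[3] parameter, which is silently treated as int*.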
Arrays and pointers in C and C++ can be used with the exact same semantics and syntax in the vast majority of cases.
That is achieved by one feature:
Arrays decay to pointers to their first element in nearly all contexts.
Exceptions in C: sizeof, _Alignof, _Alignas, address-of &
In C++, the difference can also be important for overload-resolution.
In addition, array notation for function arguments is deceptive, these function-declarations are equivalent:
int f(int* a);
int f(int a[]);
int f(int a[3]);
But not to this one:
int f(int (&a)[3]);
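A rough sketch of why the reference version is different: the pointer parameter has lost the length, while the reference-to-array parameter keeps it in its type. All three function names are hypothetical.

```cpp
#include <cstddef>

// Pointer parameter: the array has decayed, so only the pointer size is known.
std::size_t size_seen_by_pointer(int* a) {
    return sizeof(a);  // size of a pointer, whatever the array length was
}

// Reference-to-array parameter: the type still carries the length.
std::size_t size_seen_by_reference(int (&a)[3]) {
    return sizeof(a);  // 3 * sizeof(int)
}

// Template over the length: N is deduced from the argument's array type.
template <std::size_t N>
std::size_t length_of(int (&)[N]) {
    return N;
}
```

Passing an int[4] to size_seen_by_reference would be a compile error, whereas size_seen_by_pointer accepts an array of any length.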
Besides what has already been told, there is one big difference:
pointers are variables to store memory addresses, and they can be incremented or decremented and the values they store can change (they can point to any other memory location). That's not the same for arrays; once they are allocated, you can't change the memory region they reference, e.g. you cannot assign other values to them:
int my_array[10];
int x = 2;
my_array = &x; // error: an array cannot be assigned to
my_array++; // error: an array cannot be incremented
Whereas you can do the same with a pointer:
int *p = my_array;
p++;
p = &x;
The meaning in this guide was simply that in C# an array is an object (perhaps like in STL that we can use in C++), while in C++ an array is basically a sequence of variables located & allocated one after the other, and that's why we can refer to them using a pointer (pointer++ will give us the next one etc.).
it's as simple as:
int arr[10];
int* arr_pointer1 = arr;
int* arr_pointer2 = &arr[0];
so, since arrays are contiguous in memory, writing
arr[1];
is the same as writing:
*(arr_pointer1+1)
pushing things a bit further, writing:
arr[2];
//resolves to
*(arr+2);
//note also that this is perfectly valid
2[arr];
//resolves to
*(2+arr);
First of all, I am a beginner when it comes to C++ programming. Yesterday I encountered something rather strange. I was trying to determine the length of an array via a pointer pointing towards it. Since sizeof didn't work I did a little Google search and ended up on this website where I found the answer that it was not possible. Instead I should put an out of bound value at the last index of the array and increment a counter until this index is reached. Because I didn't want to overwrite the information that was contained at the last index, I tried putting the out of bound value one index after the last one. I expected it to fail, but for some reason it didn't.
I thought that I made a mistake somewhere else and that the array was longer than I assigned it to be, so I made the following test:
int a[4];
a[20] = 42;
std::cout << a[20];
The output is 42 without any errors. Why does this work? This should not be valid at all, right? What's even more interesting is the fact that this works with any primitive type array. However, once I use a std::string the program instantly exits with 1.
Any ideas?
Your system just happens to not be using the memory that just happens to be 20 * sizeof(int) bytes further from the address of your array. (From the beginning of it.) Or the memory belongs to your process and therefore you can mess with it and either break something for yourself or just by lucky coincidence break nothing.
Bottom line, don't do that :)
I think what you need to understand is the following:
When you create a[4], the compiler allocates memory for 4 integers, and a refers to the address of the first one: (a == &(a[0])).
When you read or write, the compiler doesn't check whether you are within bounds (because it no longer has this information); it just goes to the address of the requested cell of the array in the following way: a[X] == *(a + X), which is the address of a plus sizeof(int) * X bytes.
In C++ it's the programmer's responsibility to check the bounds when accessing an array.
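One way to take on that responsibility, sketched under the assumption that the caller knows the array length and passes it along. The helper name checked_get is hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: the caller supplies the length, because the
// pointer alone does not carry it.
int checked_get(const int* data, std::size_t length, std::size_t index) {
    if (index >= length) {
        throw std::out_of_range("index past the end of the array");
    }
    return data[index];
}
```

With this helper, the a[20] access from the question would be rejected with an exception instead of silently touching memory outside the array.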