Are there any repercussions to having a value stored in an array at index -1? Could it affect the program or the computer in a bad way? I'm really curious; I'm new to programming and any clarification really helps, thanks.
There's no way to store anything in an array object at index -1. A mere attempt to obtain a pointer to that non-existent element results in undefined behavior.
Negative indices (like -1) may appear in array-like contexts in situations where the base pointer is not the array object itself, but rather an independent pointer pointing into the middle of another array object, as in
int a[10];
int *p = &a[5];
p[-1] = 42; // OK, sets `a[4]`
p[-2] = 5; // OK, sets `a[3]`
But any attempt to access non-existent elements before the beginning of the actual array results in undefined behavior:
a[-1]; // undefined behavior
p[-6]; // undefined behavior
When you access an array element by putting a value in brackets, you are really specifying an offset (scaled by the size of the element type) from a base address. If you've allocated the array in the typical way, like int *a = new int[N], the memory you're allowed to use runs from address a up to (but not including) a + N (that is, sizeof(int) * N bytes), so trying to read or write index -1 takes you out of the bounds of your array, which is undefined behavior and may well lead to an error or a program crash.
There's of course a chance that your pointer does not point to the beginning of an allocated sequence, as in (continuing the previous example) int *b = a + 1; in that case b[-1] is valid (it refers to a[0]). But since it's hard to keep code like this correct, I would still recommend against it.
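A minimal sketch of the bounds described above, assuming the typical int *a = new int[N] allocation; only indices 0 through N-1 name real elements:
#include <cstddef>
int main()
{
    const std::size_t N = 10;
    int *a = new int[N];
    a[0] = 1;           // same element as *(a + 0)
    a[N - 1] = 2;       // last valid element, same as *(a + (N - 1))
    int *end = a + N;   // one-past-the-end pointer: may be formed, but not dereferenced
    (void)end;
    // a[-1] or a[N] would touch memory outside the allocation: undefined behavior.
    delete[] a;
}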
Related
char a[] = "hello";
My understanding is that a acts like a constant pointer to a string. I know writing a++ won't work, but why?
No, it's not OK to increment an array. Although arrays are freely convertible to pointers, they are not pointers. Therefore, writing a++ will trigger an error.
However, writing
char *p = a;
p++;
is fine, because p is a pointer whose value equals the location of a's initial element.
a++ is not well-formed since a decays to a pointer, and the result of the decay is not an lvalue (so there is no persistent object whose state could be "incremented").
If you want to manipulate pointers to the array, you should first create such a pointer:
char* p = a; // decayed pointer initializes p
p++; // OK
++p; // even OKer
This is actually a very good question. Before discussing it, let's go back to the basic concepts.
What happens when we declare a variable?
int a=10;
Well, we get a memory location to store the variable a. Apart from this, an entry is created in the symbol table that contains the address of the variable and the name of the memory location (a in this case).
Once the entry is created, you can never change anything in the symbol table, which means you can't update the address. Getting an address for a variable is not in our hands; it's done by the computer system.
Let's say we get address 400 for our variable a.
Now the computer has assigned an address to the variable a, so later on we can't ask the computer to change this address 400, because again, it's not in our hands; the computer system decides it.
Now you have an idea of what happens when we declare a variable. Let's come back to our question.
Let's declare an array.
int arr[10];
So, when we declare this array, an entry is created in the symbol table storing the address and the name of the array.
Let's assume we get address 500 for this variable.
Let's see what happens when we want to do something like this:
arr++
When we increment arr, we are asking to change the address 500 itself; that is not possible and not in our hands, because it was decided by the computer system, so we can't change it.
Instead of doing this, we can declare a pointer variable:
int *p = arr;
What happens in this situation is that another entry is created in the symbol table, storing the name and the address of the pointer variable p.
So when we try to increment p by doing p++, we are not changing anything in the symbol table; instead we are changing the value stored in the pointer variable (the address it holds), which we can do and are allowed to do.
Also, it's fairly obvious that if we could increment arr, we would ultimately lose the address of our array. And if we lose the address of the array, how will we access it at a later point?
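A minimal sketch of the difference described above (the exact diagnostic text will vary by compiler): the array name itself cannot be incremented, but a pointer initialized from it can.
int main()
{
    int arr[10] = {};
    // arr++;        // error: arr is an array, not a modifiable pointer
    int *p = arr;    // p holds the address of arr[0]
    p++;             // fine: p is an ordinary pointer variable
    *p = 42;         // sets arr[1]
}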
It is never legal in C to assign to an expression of array type. Increment (++) involves assignment, and is thus also not legal.
What you showed at the top is a special syntax for initializing a char array variable.
I think this answer explains why it's not a good idea:
It's because the array name is treated as a constant pointer within the function in which it is declared.
There is a reason for it. The array variable is supposed to refer to the first element of the array, i.e., the first memory location of the contiguous block in which it is stored. So, if we had the liberty to change (increment or decrement) the array pointer, it would no longer point to the first memory location of the block and would thus lose its purpose.
It is said that often (but not always), when you get an AV at a memory address close to zero (like $89), you have an uninitialized pointer.
But I have also seen this in Delphi books... Hm... or have they all been written by the same author(s)?
Update:
Quote from "C++ builder 6 developers guide" by Bob Swart et all, page 71:
When the memory address ZZZZZZZZZ is close to zero, the cause is often
an uninitialized pointer that has been accessed.
Why is that so? Why do uninitialized pointers contain low numbers? Why not big numbers like $FFFFFFFF, or plain random numbers? Is this an urban myth?
This is confusing "uninitialized pointers" with null references or null pointers. Access to an object's fields, or indexing off a pointer, is represented as an offset from the base pointer. If that base is null, the resulting addresses will generally be either near zero (for positive offsets) or near the maximum value representable in the native pointer size (for negative offsets).
Access violations at addresses with these characteristic small (or large) values are a good clue that you have a null reference or null pointer, specifically, and not simply an uninitialized pointer. An uninitialized reference can have a null value, but may also have any other value depending on how it is allocated.
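A minimal sketch of the negative-offset case described above, kept in comments because actually performing the access is undefined behavior and the exact fault addresses are a platform assumption:
int main()
{
    int *p = nullptr;
    // p[2]  would fault near address 8           (0 + 2 * sizeof(int))
    // p[-2] would fault near address 0xFFFFFFF8  (0 - 2 * sizeof(int), on a 32-bit system)
    (void)p;   // p is never actually dereferenced here
}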
Why uninitialized pointers contain low numbers?
They don't. They can contain any value.
Why not big numbers like $FFFFFFFF?
They can perfectly well contain values like $FFFFFFFF.
or plain random numbers?
Uninitialised variables tend not to be truly random. They typically contain whatever happened to have been written to that memory location the last time it was used. For instance, it is very common for uninitialised local variables to contain the same value every time a function is called because the history of stack usage happens to be repeatable.
It's also worth pointing out that random is an often misused word. People often say random when they actually mean distributed randomly with uniform distribution. I expect that's what you meant when you used the term random.
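A minimal sketch of the repeatability point above; reading an uninitialised variable is undefined behaviour, and whether the leftover value actually shows up depends entirely on the compiler and optimisation level, so treat this purely as an illustration:
#include <iostream>
void leave_garbage()
{
    int x = 12345;   // writes 12345 into this stack slot
    (void)x;
}
void read_uninitialised()
{
    int y;                    // never initialised: reading it is undefined behaviour
    std::cout << y << '\n';   // may happen to print 12345, but nothing is guaranteed
}
int main()
{
    leave_garbage();
    read_uninitialised();
}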
Your statement about an AV close to zero is true for dereferencing a null pointer. The address is zero or close to zero because you either dereference the null pointer directly:
int* p{};
const auto v = *p; // <-- AV at memory location = 0
or access an array item:
char* p{};
const auto v = p[100]; // <--AV at memory location = 100
or a struct field:
struct Data
{
int field1;
int field2;
};
Data* p{};
const auto v = p->field2; // AV at memory location = 4
I read Are negative array indexes allowed in C? and found it interesting that negative values can be used as array indices. I tried it again with the C++11 unique_ptr and it works there as well! Of course, the deleter must be replaced with something that can delete the original array. Here is what it looks like:
#include <iostream>
#include <memory>
int main()
{
const int min = -23; // the smallest valid index
const int max = -21; // the highest valid index
const auto deleter = [min](char* p)
{
delete [](p+min);
};
std::unique_ptr<char[],decltype(deleter)> up(new char[max-min+1] - min, deleter);
// this works as expected
up[-23] = 'h'; up[-22] = 'i'; up[-21] = 0;
std::cout << (up.get()-23) << '\n'; // outputs:hi
}
I'm wondering if there is a very, very small chance of a memory leak. The address of the memory allocated on the heap (new char[max-min+1]) could wrap around when 23 is added to it and become a null pointer. Subtracting 23 would still yield the array's original address, but the unique_ptr may recognize its stored pointer as null. The unique_ptr may then never delete it, because it's null.
So, is there a chance that the previous code will leak memory or does the smart pointer behave in a way which makes it safe?
Note: I wouldn't actually use this in actual code; I'm just interested in how it would behave.
Edit: icepack brings up an interesting point, namely that pointer arithmetic is only valid when both the operand and the result point into the same array object (or one past its end):
§5.7 [expr.add] p5
If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
As such, the new char[max-min+1] - min in your code already invokes UB.
Now, on most implementations, this will not cause problems. The destructor of std::unique_ptr, however, will (pre-edit answer from here on out):
§20.7.1.2.2 [unique.ptr.single.dtor] p2
Effects: If get() == nullptr there are no effects. Otherwise get_deleter()(get()).
So yes, there is a chance that you will leak memory here if the adjusted pointer indeed maps to whatever value represents the null pointer value (most likely 0, but not necessarily). And yes, I know this is the one for single objects, but the array one behaves exactly the same:
§20.7.1.3 [unique.ptr.runtime] p2
Descriptions are provided below only for member functions that have behavior different from the primary template.
And there is no description for the destructor.
new char[max-min+1] doesn't allocate memory on the stack but on the heap; that's how standard operator new behaves. The expression max-min+1 evaluates to 3, so this amounts to allocating 3 bytes on the heap. No problem here.
However, subtracting min results in a pointer that is 23 bytes beyond the beginning of the memory returned by new, and since you only allocated 3 bytes, it points to a location not owned by you; anything that follows results in undefined behavior.
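For comparison, here is a minimal sketch (not from the original answers) of one way to get the same negative-index interface without ever forming an out-of-bounds pointer: let the unique_ptr own the allocation directly and translate the logical index at access time. The at helper is a hypothetical name introduced only for this example.
#include <iostream>
#include <memory>
int main()
{
    const int min = -23; // smallest logical index
    const int max = -21; // highest logical index
    std::unique_ptr<char[]> up(new char[max - min + 1]);   // stored pointer stays inside the allocation
    auto at = [&](int i) -> char& { return up[i - min]; }; // map a logical index to a real index
    at(-23) = 'h'; at(-22) = 'i'; at(-21) = 0;
    std::cout << up.get() << '\n'; // outputs: hi
}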
What is the difference if I do
int *i = new int;
*i = 5;
*(i+1) = 20;
and
int *i2 = new int [2];
i2[0] = 5;
i2[1] = 20;
I can access and use these 2 pointers the same way but what is the difference between these 2 examples and what errors can occur if I don't allocate enough memory, as in the first example?
The difference is the first one invokes undefined behaviour. Anything could happen, including a program crash, or data corruption, or even simply just "working".
The first option writes to memory that hasn't been allocated. This could lead to unpredictable behaviour such as a crash.
In the first case we have allocated memory for only one integer, so we cannot do *(i+1), which moves to the next location; that is undefined behaviour, i.e., it may crash immediately or later.
In the latter case we are allocating memory for two integers.
The most probable outcome is data corruption, but in general the behaviour is undefined.
There is no difference in which elements you are accessing. The syntax *(i+1) (pointer notation) and i[1] (array element access notation) mean the same thing. In this case you can think of a pointer and an array as equivalent (hence the two ways of accessing the same element).
As the others have mentioned, you will have undefined behavior if you (try to) access memory that has not been allocated properly.
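A minimal sketch of the properly allocated case discussed above: with new int[2], the pointer notation and the index notation reach the same valid elements.
#include <iostream>
int main()
{
    int *i2 = new int[2];   // room for two ints
    i2[0] = 5;              // same element as *(i2 + 0)
    *(i2 + 1) = 20;         // same element as i2[1], still inside the allocation
    std::cout << i2[0] << ' ' << i2[1] << '\n';   // prints: 5 20
    delete[] i2;            // array form of delete matches new[]
}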
Consider this code:
int *p = new int;
cout << sizeof(*p);
delete p;
As expected the result is 4. Now, consider this other code:
int *p = new int[10];
cout << sizeof(*p);
delete[] p;
I expected to get 40 (the size of the allocated array), however the result is still 4.
Now, suppose I have a function int *foo() that returns a pointer to memory allocated with new or with new[] (but I don't know which one):
int *p = foo();
My question is, is there a way (or hack) to know if p points to a single integer or an array of integers?
Please keep in mind that this is just a theoretical question. I won't be writing real code in this fashion.
No, there is no way of doing that. But you know the difference, because the code you wrote called new or new[].
The reason by the way that:
cout << sizeof(*p);
gives you 4 in both cases is because p is a pointer to an int, the expression *p means the thing pointed to by such a pointer (i.e. an int) and the size of an int on your platform is 4. This is all evaluated at compile time, so even if new[] did return a special value, sizeof would not be able to use it.
No, because all you get back is an address; sizeof(*p) just gives the size of an int in both cases. You created it, so you're expected to know what it is.
In both examples the type of p is the same: int *. sizeof operates on the type, not the data. It's computed at compile time.
You have a couple of choices. You can keep track of the array size yourself, or you can venture into using one of the containers in the standard library, such as vector<int>. These containers will track the size (e.g. vector<int>::size()) for you.
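A minimal sketch of that suggestion: std::vector keeps track of its own size, so there is nothing to recover from a raw pointer.
#include <iostream>
#include <vector>
int main()
{
    std::vector<int> v(10);                         // ten ints
    std::cout << v.size() << '\n';                  // prints 10
    std::cout << v.size() * sizeof(v[0]) << '\n';   // 40 on a platform where int is 4 bytes
}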
sizeof(x) returns the amount of memory needed to contain x as declared.
There is no dynamic aspect to this at all.
sizeof(*foo), where foo is a bar *, will always be the same as sizeof(bar).
No, there isn't any way.
Obligatory question: Why do you need to know?
If it's "because I need to know whether to say delete [] or delete", then just use arrays all the time, if for some obscure reason you can't figure out which one you used in your own code.
Having a function that can return a pointer to a single item or an array is a bad design decision. You can always return a pointer to an array of size 1:
return new int[1];
First, sizeof(*p) always returns the size of the pointed-to int, so it always returns 4.
Now, how can you know whether p points to an int or to an int[]?
There is no standard way to do it. However, you can hack the platform and find out. For example, if you print p[-1], p[-2], ..., p[-4] and so on with certain compilers (say, on Linux in my case), you will see a particular pattern in the values at those locations. However, this is just a hack, and you cannot rely on it.