Pointer to [-1]th index of array - c++

How does a pointer points to [-1]th index of the array produce legal output everytime. What is actually happening in the pointer assignment?
#include<stdio.h>
int main()
{
int realarray[10];
int *array = &realarray[-1];
printf("%p\n", (void *)array);
return 0;
}
Code output:
manav#workstation:~/knr$ gcc -Wall -pedantic ptr.c
manav#workstation:~/knr$ ./a.out
0xbf841140
EDIT: If this scenario is valid, then can i use this to define an array whose index start from 1 instead of 0, namely: array[1], array[2],...

Youre simply getting a pointer that contains the address of that "imaginary" location, i.e. the location of the first element &realarray[0] minus the size of one element.
This is undefined behavior, and might break horribly if, for instance, your machine has a segmented memory architecture. It's working because the compiler writer has chosen to implement the arithmetic as outlined above; that could change at any moment, and another compiler might behave totally differently.

a[b] is defined as *(a+b)
therefore a[-1] is *(a-1)
Whether a-1 is a valid pointer and therefore the dereference is valid depends on the context the code is used in.

The behaviour is undefined.
What you have observed may have happened in your particular compiler and configuration, but anything may happen in a different situation. You cannot rely on this behaviour at all.

The behavior is undefined. You can only calculate a pointer to any of the elements of an array, or one past, but that's it. You can only dereference a pointer to any of the elements of an array (not the one past pointer). Looking at your variable names, looks like you're asking a question from this C FAQ. I think that the answer on the FAQ is very good.

Although, as others have noted, it is undefined behaviour in this case, it compiles without warnings because in general, foo[-1] might be valid.
For example, this is fine:
int realarray[10] = { 10, 20, 30, 40 };
int *array = &realarray[2];
printf("%d\n", array[-1]);

In C and C++, array indexes are not checked at runtime. You are performing pointer arithmetic which may or may not end up giving defined results (not here).
However, in C++ you can use an array class that does provide bounds checks, e.g boost::array or std::tr1::array (to be added to standard library in C++0x):
#include <cstdio>
#include <boost/array.hpp>
int main()
{
try {
boost::array<int, 10> realarray;
int* p = &realarray.at(-1);
printf("%p\n", (void *)p);
} catch (const std::exception& e) {
puts(e.what());
}
}
Output:
array<>: index out of range
Also produces a compiler warning:
8 test.cpp [Warning] passing negative
value -0x000000001' for converting 1
ofT& boost::array::at(size_t)
[with T = int, unsigned int N = 10u]'

It simply points to the address of the item just ahead of the array in memory.
The array can simply be thought of as being a pointer. This is then simply decremented by one.

Here you just performing the pointer arithmetic , It will get firs index address of the relarray
See, if you &relarray[+1] , you would get the second element address of the array. since
&relarray[0] is pointing the first index address.

array points to one location before the starting address of realarray. However, what confused me is why does this compiled without any warnings.

You're just pointing to the 4 bytes located before the array.

This is perfectly well defined. Your code is guaranteed to be accepted by all compilers, and never crash at run time. C/C++ pointers are a numeric data type that obey the rules of arithmetic. Addition and subtraction work, and the bracket notation [] is just a fancy syntax for addition. NULL is literally the integer 0.
And this is why C/C++ are dangerous. The compiler will let you create pointers that point anywhere without complaint. Dereferencing the wild pointer in your example, *array = 1234; would produce undefined behavior, anything from subtle corruption to a crash.
Yes, you could use it to index from 1. Don't do this! The C/C++ idiom is to always index from 0. Other people who saw the code indexing from 1 would be tempted to "fix" it to index from 0.

The experiment could have provided little more clue if it was the following. Instead of printing the pointer value as
printf("%p\n", (void *)array);
, print the array element value
printf("%d\n", *array);
Thats because printing a pointer with %p will always produce some output (without any misbehavior), but nothing can be deduced from it.

Related

Using std::sort with pointers to integer variables produces unexpected output

I decided to compile and run this piece of code (out of curiosity) and the G++ compiler successfully compiled the program. I was expecting to see a compile error or a runtime error, or at least the values of a and b swapped (as 5 > 1), since the std::sort() function is being called with two pointers to integers.
(Please note that I know this is not a good practice and I was basically just playing with pointers)
#include <iostream>
#include <algorithm>
int main() {
int a{5};
int b{4};
int c{1};
int* aptr = &a;
int* bptr = &b;
std::sort(aptr, bptr);
std::cout << a << ' ' << b << ' ' << c << '\n';
return 0;
}
However, upon executing the program, the output I got was this:
5 4 1
My question is, how did C++ allow this call to the std::sort() function? And how did it not end up actually sorting everything between the memory addresses of a and b (potentially including even garbage values in memory)?
I mean, if we tried this with C-style arrays like this (std::sort(arr, arr+n)) it would successfully sort the C-style array, because arr and arr+n are basically just pointers where n is the size of the array and arr is the pointer to the first element.
(I'm sorry if this question sounds stupid. I'm still learning C++.)
Your program is ill formed, no diagnostic required. You passed pointers that do not form a range to a std algorithm.
Any behaviour whatsoever by the program is conforming to the C++ standard.
Compilers optimize around the fact that pointers to unrelated objects are incomparable and their difference is undefined. A sort here would trip over so much UB the optimizer could eliminate branches like crazy (as any branch with UB can be eliminated and replaced with the alternative (whatever code the alternate branch is a legal result of UB)).
Good C++ coding style thus focuses on avoiding UB and IL-NDR code.
C++ accepts your code as it is syntactically right. But it doesn't work because sort(it1, it2) expects it1 one to be some starting position of an array and it2 to be the ending position of the same array. you have provided two different arrays to the sort function which can yield any of two following situations:
positionof(it1) < positionof(it2): suppose in computer's memory array a and b are stored in the like this- 5(a), -1, -2, 10, 4(b). then the sort function will sort from 5 to 4 resulting in : -2(a),-1,4,5,10(b).
positionof(it1) > positionof(it2) (your machine's case): the sort function will do nothing as left_position > right_position.

How to make uninitiated pointer not equal to 0/null?

I am a c++ learner. Others told me "uninitiatied pointer may point to anywhere". How to prove that by code.?I made a little test code but my uninitiatied pointer always point to 0. In which case it does not point to 0? Thanks
#include <iostream>
using namespace std;
int main() {
int* p;
printf("%d\n", p);
char* p1;
printf("%d\n", p1);
return 0;
}
Any uninitialized variable by definition has an indeterminate value until a value is supplied, and even accessing it is undefined. Because this is the grey-area of undefined behaviour, there's no way you can guarantee that an uninitialized pointer will be anything other than 0.
Anything you write to demonstrate this would be dictated by the compiler and system you are running on.
If you really want to, you can try writing a function that fills up a local array with garbage values, and create another function that defines an uninitialized pointer and prints it. Run the second function after the first in your main() and you might see it.
Edit: For you curiosity, I exhibited the behavior with VS2015 on my system with this code:
void f1()
{
// junk
char arr[24];
for (char& c : arr) c = 1;
}
void f2()
{
// uninitialized
int* ptr[4];
std::cout << (std::uintptr_t)ptr[1] << std::endl;
}
int main()
{
f1();
f2();
return 0;
}
Which prints 16843009 (0x01010101). But again, this is all undefined behaviour.
Well, I think it is not worth to prove this question, because a good coding style should be used and this say's: Initialise all variables! One example: If you "free" a pointer, just give them a value like in this example:
char *p=NULL; // yes, this is not needed but do it! later you may change your program an add code beneath this line...
p=(char *)malloc(512);
...
free(p);
p=NULL;
That is a safe and good style. Also if you use free(p) again by accident, it will not crash your program ! In this example - if you don't set NULL to p after doing a free(), your can use the pointer by mistake again and your program would try to address already freed memory - this will crash your program or (more bad) may end in strange results.
So don't waste time on you question about a case where pointers do not point to NULL. Just set values to your variables (pointers) ! :-)
It depends on the compiler. Your code executed on an old MSVC2008 displays in release mode (plain random):
1955116784
1955116784
and in debug mode (after croaking for using unitialized pointer usage):
-858993460
-858993460
because that implementation sets uninitialized pointers to 0xcccccccc in debug mode to detect their usage.
The standard says that using an uninitialized pointer leads to undefined behaviour. That means that from the standard anything can happen. But a particular implementation is free to do whatever it wants:
yours happen to set the pointers to 0 (but you should not rely on it unless it is documented in the implementation documentation)
MSVC in debug mode sets the pointer to 0xcccccccc in debug mode but AFAIK does not document it (*), so we still cannot rely on it
(*) at least I could not find any reference...

two short questions about pointers and references

consider this code:
double *pi;
double j;
pi = &j;
pi[3] = 5;
I don't understand how is that possible that I can perform the last line here.
I set pi to the reference of j, which is a double variable, and not a double [] variable. so how is this possible that I can perform an array commands on it?
consider this code:
char *c = "abcdefg";
std::cout << &(c[3]) << endl;
the output is "defg". I expected that I will get a reference output because I used &, but instead I got the value of the char * from the cell position to the end. why is that?
You have two separate questions here.
A pointer is sometimes used to point to an array or buffer in memory. Therefore it supports the [] syntax. In this case, using pi[x] where x is not 0 is invalid as you are not pointing to an array or buffer.
Streams have an overload for char pointers to treat them as a C-style string, and not output their address. That is what is happening in your second case. Try std::cout << static_cast<const void *>(&(c[3])) << endl;
Pointers and arrays go hand in hand in C (sort of...)
pi[3] is the same as *(pi + 3). In your code however this leads to Undefined Behavior as you create a pointer outside an object bounds.
Also be careful as * and & are different operators depending on in which kind of expression the appear.
That is undefined behavior. C++ allows you to do things you ought not to.
There are special rules for char*, because it is often used as the beginning of a string. If pass a char* to cout, it will print whatever that points to as characters, and stop when it reaches a '\0'.
Ok, so a few main things here:
A pointer is what it is, it points to a location in the memory. So therefore, a pointer can be an array if you whish.
If you are working with pointers (dangerous at times), this complicates things. You are writing on p, which is a pointer to a memory location. So, even though you have not allocated the memory, you can access the memory as an array and write it. But this gives us the question you are asking. How can this be? well, the simple answer is that you are accessing a zone of memory where the variable you have created has absolutely no control, so you could possibly be stepping on another variable (if you have others) or simply just writting on memory that has not been used yet.
I dont't understand what you are asking in the second question, maybe you could explain a little more? Thanks.
The last line of this code...
double *pi;
double j;
pi = &j;
pi[3] = 5;
... is the syntactic equivalent to (pi + 3) = 5. There is no difference in how a compiler views a double[] variable and a double variable.
Although the above code will compile, it will cause a memory error. Here is safe code that illustrates the same concepts...
double *pi = new double[5]; // allocate 5 places of int in heap
double j;
pi[3] = 5; // give 4th place a value of 5
delete pi; // erase allocated memory
pi = &j; // now get pi to point to a different memory location
I don't understand how is that possible that I can perform the last
line here. I set pi to the reference of j
Actually, you're setting your pointer pi, to point to the memory address of j.
When you do pi[3], you're using a non-array variable as an array. While valid c++, it is inherently dangerous. You run the risk of overwriting the memory of other variables, or even access memory outside your process, which will result in the operating system killing your program.
When that's said, pi[3] means you're saying "give me the slot third down from the memory location of pi". So you're not touching pi itself, but an offset.
If you want to use arrays, declare them as such:
double pi[5]; //This means 5 doubles arrayed aside each other, hence the term "array".
Appropos arrays, in c++ it's usually better to not use raw arrays, instead use vectors(there are other types of containers):
vector<double> container;
container.push(5.25); //"push" means you add a variable to the vector.
Unlike raw arrays, a container such as a vector, will keep it's size internally, so if you've put 5 doubles in it, you can call container.size(), which will return 5. Useful in for loops and the like.
About your second question, you're effectively returning a reference to a substring of your "abcdefg" string.
&([3]) means "give me a string, starting from the d". Since c-style strings(which is what char* is called) add an extra NULL at the end, any piece of code that takes these as arguments(such as cout) will keep reading memory until they stumble upon the NULL(aka a 0). The NULL terminates the string, meaning it marks the end of the data.
Appropos, c-style strings are the only datatype that behaves like an array, without actually being one. This also means they are dangerous. Personally I've never had any need to use one. I recommend using modern strings instead. These newer, c++ specific variables are both safe to use, as well as easier to use. Like vectors, they are containers, they keep track of their size, and they resize automatically. Observe:
string test = "abcdefg";
cout<<test.size()<<endl;//prints 7, the number of characters in the array.
test.append("hijklmno");//appends the string, AND updates the size, so subsequent calls will now return 15.

why does this fail

i am stuck and unable to figure out why this is the following piece of code is not running .I am fairly new to c/c++.
#include <iostream>
int main(){
const char *arr="Hello";
const char * arr1="World";
char **arr2=NULL;
arr2[0]=arr;
arr2[1]=arr1;
for (int i=0;i<=1;i++){
std::cout<<arr2[i]<<std::endl;
}
return 0;
}
where as this is running perfectly fine
#include <iostream>
int main(){
const char *arr="Hello";
const char * arr1="World";
char *arr2[1];
arr2[0]=arr;
arr2[1]=arr1;
for (int i=0;i<=1;i++){
std::cout<<arr2[i]<<std::endl;
}
return 0;
}
Why is this? and generally how to iterate over a char **?
Thank You
char *arr2[1]; is an array with one element (allocated on the stack) of type "pointer to char". arr2[0] is the first element in that array. arr2[1] is undefined.
char **arr2=NULL; is a pointer to "pointer to char". Note that no memory is allocated on the stack. arr2[0] is undefined.
Bottom line, neither of your versions is correct. That the second variant is "running perfectly fine" is just a reminder that buggy code can appear to run correctly, until negligent programming really bites you later on and makes you waste hours and days in debugging because you trashed the stack.
Edit: Further "offenses" in the code:
String literals are of type char const *, and don't you forget the const.
It is common (and recommended) practice to indent the code of a function.
It is (IMHO) good practice to add spaces in various places to increase readability (e.g. post (, pre ), pre and post binary operators, post ; in the for statement etc.). Tastes differ, and there is a vocal faction that actually encourages leaving out spaces wherever possible, but you didn't even do that consistently - and consistency is universially recommended. Try code reformatters like astyle and see what they can do for readability.
This is not correct because arr2 does not point to anything:
char **arr2=NULL;
arr2[0]=arr;
arr2[1]=arr1;
correct way:
char *arr2[2] = { NULL };
arr2[0]=arr;
arr2[1]=arr1;
This is also wrong, arr2 has size 1:
char *arr2[1];
arr2[0]=arr;
arr2[1]=arr1;
correct way is the same:
char *arr2[2] = { NULL };
arr2[0]=arr;
arr2[1]=arr1;
char **arr2=NULL;
Is a pointer to a pointer that points to NULL while
char *arr2[1];
is an array of pointers with already allocated space for two items.
In the second case of the pointer to a pointer you are are trying to write data in a memory location that does not exist while in the first place the compiler has already allocated two slots of memory for the array so you can assign values to the two elements.
If you think of it very simplistically, a C pointer is nothing but an integer variable, whose value is actually a memory address. So by defining char *x = NULL you are actually defining a integer variable with value NULL (i.e zero). Now suppose you write something like *x = 5; This means go to the memory address that is stored inside x (NULL) and write 5 in it. Since there is no memory slot with address 0, the the entire statement fails.
To be honest it;s been ages since I last had to deal with such stuff however this little tutorial here, might clear the motions of array and pointers in C++.
Put simply the declaration of a pointer does NOT reserve any memory, where as the declration of a array doesn't.
In your first example
Your line char **arr2=NULL declares a pointer to a pointer of characters but does not set it to any value - thus it is initiated pointing to the zero byte (NULL==0). When you say arr2[0]=something you are attempting to place a valuei nthis zero location which does not belong to you - thus the crash.
In your second example:
The declaration *arr2[2] does reserve space for two pointers and thus it works.

Syntax for pointer to portion of multi-dimensional statically-allocated array

Okay, I have a multi-dimensional array which is statically-allocated. I'd very much like to get a pointer to a portion of it and use that pointer to access the rest of it. Basically, I'd like to be able to do something like this:
#include <stdio.h>
#include <string.h>
#define DIM1 4
#define DIM2 4
#define DIM3 8
#define DIM4 64
static char theArray[DIM1][DIM2][DIM3][DIM4] = {0};
int main()
{
strcpy(theArray[0][0][0], "hello world");
char** ptr = theArray[0][0];
printf("%s\n", ptr[0]);
return 0;
}s
This code results in this error using gcc:
t.c: In function ‘main’:
t.c:17: warning: initialization from incompatible pointer type
char** is apparently not the correct type to use here. I assume it's because statically-allocated arrays are created as a single block in memory while dynamically allocated ones are separated in memory, with each dimension having a pointer to the next one.
However, as you likely noticed, the number of dimensions here is painfully large, and I'm obviously going to need to use actual variables to index the array rather than the nice, slim character 0, so it's going to get painfully long to index the array in actual code. I'd very much like to have a pointer into the array to use so that accessing the array is much less painful. But I can't figure out the correct syntax - if there even is any. So, any help would be appreciated. Thanks.
theArray[0][0] gets rid of the first two dimensions, so you have something of type char [DIM3][DIM4]. The first dimension will drop out when the array decays to a pointer, so the declaration you want is:
char (*ptr)[DIM4] = theArray[0][0];
For what it's worth, gcc also displays a warning for your array declaration: "warning: missing braces around initializer". Static variables and global variables will automatically be initialized to 0, so you can fix the warning by getting rid of the initializer:
static char theArray[DIM1][DIM2][DIM3][DIM4];
Given a declaration
T arr[J][K][L][M];
the following all hold:
Expression Type Decays to
---------- ---- ---------
arr T [J][K][L][M] T (*)[K][L][M]
&arr T (*)[J][K][L][M]
arr[j] T [K][L][M] T (*)[L][M]
&arr[j] T (*)[K][L][M]
arr[j][k] T [L][M] T (*)[M];
&arr[j][k] T (*)[L][M]
arr[j][k][l] T [M] T *
&arr[j][k][l] T (*)[M]
So, in your case, the type of ptr needs to be char (*)[DIM4].
If you are using Visual Studio, a nice way to find out the type of the expression is to use typeid.
cout << typeid(&theArray[0][0]).name();
which prints
char (*)[8][64]
But note that output of typeid().name() is an implementation specific behavior and as such can not be relied upon. But VS happens to be nice in this regards by printing a more meaningful name
Does not matter that it is static, your types don't match
Why a double pointer can't be used as a 2D array?
This is a good example, although the compiler may not complain,
it is wrong to declare: "int **mat" and then use "mat" as a 2D array.
These are two very different data-types and using them you access
different locations in memory. On a good machine (e.g. VAX/VMS) this
mistake aborts the program with a "memory access violation" error.
This mistake is common because it is easy to forget that the decay
convention mustn't be applied recursively (more than once) to the
same array, so a 2D array is NOT equivalent to a double pointer.
A "pointer to pointer of T" can't serve as a "2D array of T".
The 2D array is "equivalent" to a "pointer to row of T", and this
is very different from "pointer to pointer of T".
When a double pointer that points to the first element of an array,
is used with subscript notation "ptr[0][0]", it is fully dereferenced
two times (see rule #5). After two full dereferencings the resulting
object will have an address equal to whatever value was found INSIDE
the first element of the array. Since the first element contains
our data, we would have wild memory accesses.
...
etc....
http://www.ibiblio.org/pub/languages/fortran/append-c.html