C++ what happens when incrementing a char * - c++

Say I have the following code:
void incrementPointer( const char *x) {
    const char *xPtr = x; // walks along the input string
    char *localVar = new char;
    char *localVarPtr = localVar;
    while(*xPtr != '\0') {
        xPtr++;
        localVarPtr++;
    }
}
Let's say that x is pointing to some null-terminated word. After the execution of the while loop, is the localVarPtr pointing to a location that has not been allocated for localVar? For instance, if I declare some other variable in this function, and then set all bytes between localVar and localVarPtr to the character 'c', would this potentially overwrite the value of the other variable?
My other question is if this is considered bad practice (i.e. potentially overwriting variables or causing undefined behavior), what would be the way to allocate enough space for localVar if x is pointing to a word whose size is unlimited? The size of x may be larger than an unsigned integer, and thus I would not be able to use the size in the initialization of localVar.

Let's say that x is pointing to some null-terminated word. After the execution of the while loop, is the localVarPtr pointing to a location that has not been allocated for localVar?
That depends entirely on the length of x. If, for example, x contains only the null terminator, then no: localVarPtr still points to the beginning of the region allocated by new. For any longer string, it points past the single char that was allocated.
For instance, if I declare some other variable in this function, and then set all bytes between localVar and localVarPtr to the character 'c', would this potentially overwrite the value of the other variable?
The loop does not write to the memory at any of those addresses, so no, it overwrites nothing by itself. If you do write through the pointer, the write lands wherever the address points: if localVarPtr points to, say, address 0x8000 and you then modify 0x8010, and 0x8010 happens to hold a "local variable", then yes, its contents would be changed. This is the wild west: dereferencing and assigning to any valid (or sometimes invalid) address will either change what is stored there or raise a signal.
My other question is if this is considered bad practice (i.e. potentially overwriting variables or causing undefined behavior),
Yes, typically pointing to or modifying undefined memory is bad practice. A pointer to a local variable, intentionally set that way however, is perfectly fine. Undefined behavior is always bad practice.
what would be the way to allocate enough space for localVar if x is pointing to a word whose size is unlimited? The size of x may be larger than an unsigned integer, and thus I would not be able to use the size in the initialization of localVar.
As Mat noted, there is no way to have a string of "unlimited" length. If you need to allocate space to match the size of the string x points to, you simply have to know the size, either by calling a function to determine its length or by passing it in to this function. Once you know the size, you can use new to allocate for it.
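A minimal sketch of that approach, assuming the length is measured with std::strlen. The helper name duplicateWord is hypothetical, and std::unique_ptr is used here only so the allocation is released automatically. Note that std::strlen returns std::size_t rather than unsigned int, so the "larger than an unsigned integer" concern does not apply to the length itself.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical helper: copies a null-terminated string into freshly
// allocated storage that is exactly large enough to hold it.
std::unique_ptr<char[]> duplicateWord(const char *x) {
    // Count the characters that come before the terminating '\0'.
    const std::size_t length = std::strlen(x);

    // Allocate one extra element for the terminator.
    std::unique_ptr<char[]> copy(new char[length + 1]);

    // Copy the characters, including the terminator.
    std::memcpy(copy.get(), x, length + 1);
    return copy;
}
```

Every element the copy touches lies inside the allocated block, so no neighbouring object can be overwritten.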

Related

Can anybody explain why *var=i is valid [duplicate]

char a[] = "hello";
My understanding is that a acts like a constant pointer to a string. I know writing a++ won't work, but why?
No, it's not OK to increment an array. Although arrays are freely convertible to pointers, they are not pointers. Therefore, writing a++ will trigger an error.
However, writing
char *p = a;
p++;
is fine, because p is a pointer, with value equal to the location of a's initial element.
a++ is not well-formed since a decays to a pointer, and the result of the decay is not an lvalue (so there is no persistent object whose state could be "incremented").
If you want to manipulate pointers to the array, you should first create such a pointer:
char* p = a; // decayed pointer initializes p
p++; // OK
++p; // even OKer
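As a small illustration of the difference between the array and a pointer made from it, here is a sketch; the function name countChars is made up for this example. The static_assert lines check at compile time that a has array type, and the loop increments only the pointer p.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper: counts the characters before the terminator by
// advancing a pointer that was initialized from the array.
std::size_t countChars() {
    char a[] = "hello";

    // The type of a is char[6]; it is an array, not a pointer.
    static_assert(!std::is_pointer<decltype(a)>::value, "a is an array");
    static_assert(sizeof(a) == 6, "five letters plus the terminator");

    char *p = a;          // a decays to a pointer to its first element
    while (*p != '\0') {
        ++p;              // fine: p is a modifiable pointer object
    }
    // Distance between the terminator and the first element.
    return static_cast<std::size_t>(p - a);
}
```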
This is a very good question actually. Before discussing this, let's go back to the basic concepts.
What happens when we declare a variable ?
int a=10;
Well, we get a memory location to store the variable a. Apart from this, an entry is created in the symbol table that contains the address of the variable and the name of the memory location (a in this case).
Once the entry is created, you can never change anything in the symbol table, which means you can't update the address. Choosing an address for a variable is not in our hands; it is done by the computer system.
Let's say we get address 400 for our variable a.
Now that the computer has assigned an address to the variable a, we can't later ask it to change this address 400, because again, it is not in our hands; the computer system does it.
Now you have an idea of what happens when we declare a variable. Let's come to our question.
Let's declare an array.
int arr[10]
So, when we declare this array, an entry is created in the symbol table that stores the address and the name of the array.
Let's assume we get address 500 for this variable.
Let's see what happens when we want to do something like this:
arr++
When we increment arr, we want to increment 500. That is not possible and not in our hands, because the address has been decided by the computer system, so we can't change it.
Instead of doing this, we can declare a pointer variable:
int *p = arr;
What happens in this situation is that an entry is again created in the symbol table, storing the name and the address of the pointer variable p.
So when we increment p by doing p++, we are not changing anything in the symbol table. Instead we are changing the address stored in the pointer variable, which we can do and are allowed to do.
It is also clear that if we could increment arr, we would ultimately lose the address of our array. If we lose the address of the array, how will we access the array at a later point?
It is never legal in C to assign to an expression of array type. Increment (++) involves assignment, and is thus also not legal.
What you showed at the top is a special syntax for initializing a char array variable.
I think this answer here explains "why" it's not a good idea;
It's because an array is treated as a constant pointer in the function in which it is declared.
There is a reason for it. An array variable is supposed to refer to the first element of the array, that is, the first location of the block of contiguous memory in which it is stored. If we had the liberty to change (increment or decrement) the array pointer, it would no longer point to the first memory location of the block, and thus it would lose its purpose.

Why uninitialized pointers cause mem access violations close to 0?

It is said that often (but not always) when you get an AV in a memory location close to zero (like $89) you have an uninitialized pointer.
But I have seen this also in Delphi books... Hm... or they have been all written by the same author(s)???
Update:
Quote from "C++ builder 6 developers guide" by Bob Swart et al., page 71:
When the memory address ZZZZZZZZZ is close to zero, the cause is often
an uninitialized pointer that has been accessed.
Why is it so? Why uninitialized pointers contain low numbers? Why not big numbers like $FFFFFFF or plain random numbers? Is this urban myth?
This is confusing "uninitialized pointers" with null references or null pointers. Access to an object's fields, or indexes into a pointer, will be represented as an offset with respect to the base pointer. If that reference is null then the offsets will generally be addresses either near zero (for positive offsets) or addresses near the maximum value of the native pointer size (for negative offsets).
Access violations at addresses with these characteristic small (or large) values are a good clue that you have a null reference or null pointer, specifically, and not simply an uninitialized pointer. An uninitialized reference can have a null value, but may also have any other value depending on how it is allocated.
Why uninitialized pointers contain low numbers?
They don't. They can contain any value.
Why not big numbers like $FFFFFFF?
They can perfectly well contain values like $FFFFFFF.
or plain random numbers?
Uninitialised variables tend not to be truly random. They typically contain whatever happened to have been written to that memory location the last time it was used. For instance, it is very common for uninitialised local variables to contain the same value every time a function is called because the history of stack usage happens to be repeatable.
It's also worth pointing out that random is an often misused word. People often say random when they actually mean distributed randomly with uniform distribution. I expect that's what you meant when you used the term random.
Your statement about AV close to zero is true for dereferencing a null pointer. It is zero or close to zero because you either dereference the null pointer:
int* p{};
const auto v = *p; // <-- AV at memory location = 0
or access an array item:
char* p{};
const auto v = p[100]; // <--AV at memory location = 100
or a struct field:
struct Data
{
int field1;
int field2;
};
Data* p{};
const auto v = p->field2; // AV at memory location = 4
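The field case can be checked without dereferencing anything. The sketch below (the function name field2Offset is hypothetical) uses the standard offsetof macro to compute how far field2 sits from the start of a Data object; with a null base pointer, that offset is the address the faulting access would target. The value 4 quoted above assumes a 4-byte int.

```cpp
#include <cstddef>

struct Data {
    int field1;
    int field2;
};

// Offset of field2 from the start of a Data object. When the base
// pointer is null, a read of p->field2 targets this address.
constexpr std::size_t field2Offset() {
    return offsetof(Data, field2);
}
```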

Array values at negative one

Are there any repercussions to having a value in an array stored at -1? Could it affect the program or computer in a bad way? I am really curious, I'm new to programming and any clarification I can get really helps, thanks.
There's no way to store anything in an array object at index -1. A mere attempt to obtain a pointer to that non-existing element results in undefined behavior.
Negative indices (like -1) may appear in array-like contexts in situations when base pointer is not the array object itself, but rather an independent pointer pointing into the middle of another array object, as in
int a[10];
int *p = &a[5];
p[-1] = 42; // OK, sets `a[4]`
p[-2] = 5; // OK, sets `a[3]`
But any attempts to access non-existent elements before the beginning of the actual array result in undefined behavior
a[-1]; // undefined behavior
p[-6]; // undefined behavior
When you take an element of an array by putting a value in brackets, you are specifying an offset (multiplied by the size of the element type) from a memory address. If you allocated the array in the typical way, like int *a = new int[N], the memory you are allowed to use runs from address a up to a + <size of memory allocated> (which in this case is sizeof (int) * N bytes). Index -1 is therefore out of bounds of your array; accessing it is undefined behavior and may lead to an error or a program crash.
There is of course a chance that your pointer is not the one at the beginning of the allocated sequence, for example (continuing the previous example) int *b = a + 1. In this case you may take the value of b[-1] and it would be valid, since it refers to a[0]. But because it is hard to keep code like this correct, I would still recommend against it.
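A short sketch of that last case. Note that it swaps in std::vector for the new int[N] allocation so the storage is released automatically; the function name readBeforeMiddle and the values are made up for illustration.

```cpp
#include <vector>

// Hypothetical demonstration: a negative index is valid only while the
// resulting element is still inside the original array.
int readBeforeMiddle() {
    std::vector<int> storage = {10, 20, 30, 40};
    int *a = storage.data(); // points at element 0
    int *b = a + 1;          // points at element 1

    // b[-1] means *(b - 1), which is storage[0]: inside the array.
    return b[-1];
    // a[-1] would mean *(a - 1): before the array, undefined behavior.
}
```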

How can char * name = "Duncan"; be valid if pointers can only hold addresses?

I thought that pointers can only hold addresses to other variables. So how can the following statement that I came across be valid? It's holding a string.
char * name = "Duncan"
Thanks.
It's holding a pointer to a string. That's not the same. name just contains an address of memory which contains the string.
"Duncan" is a null terminated string and as such an array of char ({'D', 'u', 'n', 'c', 'a', 'n', '\0'}). char*name="Duncan"; sets name to the address of the array.
Your statement is OK in C, but in C++ "Duncan" is a const char array, so you should use const char *name = "Duncan".
BTW, if you do not need to change the pointer variable name, it's better to have const char name[] = "Duncan". This only allocates memory for the string. Your sample code allocates memory for the string and for the pointer variable name. (Of course the compiler might optimize away name.)
It's still pointing to a string. The string gets put in memory first, and name points to that. It's compiled into your program, so it may not be obvious.
pointers can only hold addresses to other variables.
This is incorrect: references hold addresses of other variables; pointers can hold addresses of anything, or even nothing in particular (e.g. NULL).
In this case, name holds an address of a memory block of 7 bytes, containing ASCII codes for D,u,n,c,a,n, and \0.
In this particular case, the compiler will store the array with data Duncan\0 somewhere in the object file and the pointer will point there.
So yes, the pointer is only holding an address. The data are somewhere else.
This brings me to the point that writing code like this is not a good idea. For example, if you change that string through your pointer, you get undefined behavior.
That's a definition of a char pointer. After the definition, on the right side of "=", you have a constant definition. The constant is stored somewhere in memory and its address is used as first value for "name".
Later on you will be able to assign other value to "name". You are not bound to the first value, in fact "name" is a variable.
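To make the "address versus data" distinction concrete, here is a rough sketch; the function name pointerHoldsAddress and the second string are invented for the example. It uses const char * as recommended above for C++.

```cpp
#include <cstring>

// Hypothetical demonstration: name stores an address; the characters
// live in a separate array that the compiler places in the program.
bool pointerHoldsAddress() {
    const char *name = "Duncan";
    const char *first = name;     // remember the original address

    // Six letters come before the terminator.
    bool ok = std::strlen(name) == 6;
    // The literal itself is an array of 7 chars (letters plus '\0').
    ok = ok && sizeof("Duncan") == 7;

    name = "Angus";               // name is a variable: re-point it
    ok = ok && std::strcmp(name, "Angus") == 0;
    // The characters of the first string are untouched.
    ok = ok && std::strcmp(first, "Duncan") == 0;
    return ok;
}
```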

Why does binary saving of array in a file works? [C++]

C++ newbie here.
I'm trying to figure out the following line that writes a buffer into a file:
fOut.write((char *)&_data, sizeof(_data));
_data = array of integers...
I have a couple of questions:
Is &_data the address to the first element of the array?
Whatever address it is, does that mean that we only save the address of the array? then how come I still can access the array after I free it?
Shouldn't I pass sizeof(_data)*arrLength? what is the meaning of passing the size of int (in this case) and not the size of the entire array?
What does casting into char* mean when dealing with addresses?
I would really appreciate some clarifications.
Contrary to your comment, the array must be automatic or static storage duration in order for sizeof to work.
It should be just (char*)_data. The name of an array implicitly converts to a pointer to the first element.
No, write expects a pointer, and stores the content found at that location, not the location's address.
No. Since _data is an array, sizeof (_data) is the cumulative size of all elements in the array. If _data were a pointer (such as when an array is dynamically allocated on the heap), you would want numElems * sizeof(_data[0]). Multiplying the size of a pointer by the number of elements isn't helpful.
It means that the content at that address will be treated as a series of individual bytes, losing whatever numeric meaning it might have had. This is often done to perform efficient bulk copy of data, either to and from a file, or with memcpy/memmove. The data type should be POD (plain old data) or you'll get unexpected results.
If _data is a pointer to an array allocated from the heap, as your comment suggests, then the code is badly broken. In that case, you are saving just the address, and it may appear to work if you load the file back into the same instance of your program, but that's just because it's finding the data still in memory at the same address. The data wouldn't actually be in the file, and if you re-started the program before loading the file, you'd find that the data was gone. Make the changes I mentioned in both (1) and (3) in order to save the complete array regardless of whether it's allocated automatic, static, or dynamically.
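A minimal sketch of the pointer-based version, assuming the element count is known to the caller. The helper names writeInts and readInts are hypothetical; the stream calls are the standard std::ostream::write and std::istream::read. The byte count is computed as count * sizeof(element), so it does not depend on whether the storage is automatic, static, or dynamic.

```cpp
#include <cstddef>
#include <sstream>
#include <vector>

// Hypothetical helper: writes count ints starting at data as raw bytes.
// The pointer is passed directly (not its address), so the elements
// themselves are written rather than the pointer value.
void writeInts(std::ostream &out, const int *data, std::size_t count) {
    out.write(reinterpret_cast<const char *>(data),
              static_cast<std::streamsize>(count * sizeof(data[0])));
}

// Hypothetical helper: reads count ints back from a stream.
std::vector<int> readInts(std::istream &in, std::size_t count) {
    std::vector<int> values(count);
    in.read(reinterpret_cast<char *>(values.data()),
            static_cast<std::streamsize>(count * sizeof(int)));
    return values;
}
```

The same two functions work with a std::ofstream or std::ifstream opened in binary mode, since both derive from the stream types used in the signatures.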
What does casting into char* means
when dealing with addresses?
Imagine this simple example
int x = 12;
char * z = (char *)&x;
And assume an architecture where int is 4 bytes long. From the C++ Standard sizeof(char)==1.
In the declaration char * z, you could say that the char * part determines how pointer arithmetic on z works.
On the second line of the example, z now points to the first of the (in this example 4) bytes that x occupies. Doing ++z; will make z point to the second byte of the 4-byte int.
To simplify things, you could say that the pointer's type is what drives pointer arithmetic: incrementing a char * moves you by one byte, while incrementing an int * moves you by the number of bytes an int occupies in memory.
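The step sizes can be measured directly. In this sketch the function names are hypothetical, and a two-element array is used so that the incremented int * still points at a real element.

```cpp
#include <cstddef>

// Hypothetical helper: bytes moved by one increment of a char*.
std::ptrdiff_t charStepInBytes() {
    int x[2] = {12, 0};
    char *z = reinterpret_cast<char *>(x); // first byte of x[0]
    char *next = z;
    ++next;                                // second byte of x[0]
    return next - z;
}

// Hypothetical helper: bytes moved by one increment of an int*.
std::ptrdiff_t intStepInBytes() {
    int x[2] = {12, 0};
    int *p = x;
    int *q = p;
    ++q;                                   // now points at x[1]
    return reinterpret_cast<char *>(q) - reinterpret_cast<char *>(p);
}
```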
Yes
No, write uses this address as the first location, and reads through sizeof(_data) writing the whole array
sizeof(_data) will return the size of the entire array not the same as sizeof(int)
Means the data will be read byte by byte, this is the pointer required by write as it writes in binary format(byte by byte)
1) Yes, &_data is the address of the first element of your array.
2) No, write() writes the number bytes you have specified via sizeof(_data) starting at address &_data
3) You would pass sizeof(int)*arrLength if _data is a pointer to an array, but since it is an array sizeof() returns the correct size.
4) don't know. ;)
read this : http://www.cplusplus.com/reference/iostream/ostream/write/
should be.
if you call "fstream.write(a,b)" then it writes b bytes starting from location a into the file (i.e. what the address is pointing at);
it should be the size in bytes or chars .
not much, similar to casting stuff to byte[] in more civilized languages.
By the way, it will only work on simple arrays with simple values inside them...
i suggest you look into the >> << operators .
is &_data the address to the first element of the array?
Yes, it is. This is the usual way to pass a "reference" to an array in C and C++. If you passed the array itself as a parameter, the whole array contents would be copied, which is usually wasteful and unnecessary. Correction: You can pass either &_data, or just _data. Either way, the array does not need to be copied to the stack.
whatever address it is, does that mean that we only save the address of the array? then how come I still can access the array after I delete him from the memory?
No, the method uses the address it gets to read the array contents; just saving the memory address would be pointless, as you point out.
Shouldn't I pass sizeof(_data)*arrLength? I mean...
what is the logic of passing the size
of int (in this case) and not the size
of the entire array?
No, sizeof(_data) is the size of the array, not of one member. No need to multiply by length.
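A quick way to see the difference between sizeof on an array and sizeof on a pointer to the same data; the names kLength, arrayBytes and pointerBytes are made up for this sketch.

```cpp
#include <cstddef>

constexpr std::size_t kLength = 8; // hypothetical element count

// sizeof applied to the array itself: the size of all elements together.
std::size_t arrayBytes() {
    int data[kLength] = {};
    return sizeof(data);
}

// sizeof applied to a pointer into the array: just the pointer's size.
std::size_t pointerBytes() {
    int data[kLength] = {};
    int *p = data;
    return sizeof(p);
}
```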
What does casting into char* means when dealing with addresses?
Casting to char* means that the array is accessed as a list of bytes; that's necessary for accessing and writing the raw values.