What is the difference between following two snippets? - c++

char *t=new char();
and
char *t=new char[102];
as my code was accepted by using the latter one?
//BISHOPS SPOJ

char *t=new char();
Allocates memory for a single character, and calls the default constructor.
char *t=new char[102];
Creates an array of 102 chars and calls the default constructor.
As the default constructor for POD types is nothing, the difference is the amount of memory allocated (single char vs array of char)

Actually both are pointers on char, but second is pointer to char array.
Which allows you to store 102 characters into the array.
char *t=new char[102];
0 1 2 3 101
+---+---+---+---+ ... +---+
| | | | | |
+---+---+---+---+ ... +---+
^
|
-----------------------------+---+
| * |
+---+
t
It allows you to dereference these indexes 0-101.
While first one allows you to store only one character.
char *t=new char();
0
+---+
| |
+---+
^
|
-----------------------------+---+
| * |
+---+
t
Where dereferencing other index than 0 would lead to access outside of bounds and undefined behavior.
Deleting
To delete an character char *t=new char();
delete t;
Where to delete an array char *t=new char[102]; you have to write empty brackes, to explicitly say its an array.
delete [] t;
Same with these codes
char *t = new char[10]; // Poitner to array of 10 characters
char *t = new char(10); // Pointer to one character with value of 10
Memory initialialization
char *t = new char(); // default initialized (ie nothing happens)
char *t = new char(10); // zero initialized (ie set to 0)
Arrays:
char *t = new char[10]; // default initialized (ie nothing happens)
char *t = new char[10](); // zero initialized (ie all elements set to 0)

Related

C++ and C-string variables, char array size

I've searched through many articles on here and have self-tested this concept with my own code. My question is to satisfy my own curiosity and maybe help others as I cannot find a answer describing this concept in particular. My textbook (teaching C++) describes a C-string variable:
A C-string variable is an array of characters. The following array declaration provides a C-string variable "s" capable of storing a
C-string value with nine or fewer characters:
char s[10];
The 10 is for the nine letters in the string plus the null character '\0' to mark the end of the string. Like any other partially filled
array, a C-string variable uses positions starting at indexed variable
0 through as many as are needed.
I'm trying to understand the above. If the array size is 10, wouldn't the total storage size be 11? i.e. 0-10 = 11 spaces. If the \0 character occupies one space, then we'd still be able to store 10 characters and not 9 as per the book.
In my own testing, I declared a character array test[4] and stored the word "cat" in the array. When looking at individual positions within the array, I can see individual characters at each index i.e:
test[0] = c
test[1] = a
test[2] = t
test[3] =
test[4] =
Why do we need 2 additional slots in the character array and not 1?
An array with size N has indexes starting at 0 and ending with N-1. It does not have an element with index N.
With your example of char test[4], the array has indexes 0, 1, 2, and 3. Attempting to access index 4 is going off the end of the array. C and C++ do no prevent you from doing so, and attempting to do so invokes undefined behavior.
You can look at an array as a group of variables of the same type and size which are consecutive in memory one next to the other.
Arrays are indexed from 0 as the first element to the n - 1 as the last element. So you can access any element just using an index.
Trying to access an array with an index i >= n or a negative index i < 0 Will issue in undefined behavior.
Arrays of characters need to set the last element as a NULL character \0.
Here is an example:
char c[5] = "Hello"; // Error
Above c has 5 elements and \0 so it is 6 Byte long. So to correct it:
char c[6] = "Hello"; // Null character added automatically
// char c[] = "Hello";
Look at this example:
char text[6] = {'H', 'e', 'l', 'l', 'o', '\0'};
Above in a such initialization you must add the null terminator character '\0' otherwise you'll get a garbage characters at the end of your string.
std::cout << text[0]; // H which is the first element
std::cout << text[6 - 1 - 1]; // o which is the last character in the array.
arrays of other types other than characters need not to add a null terminator and the number of elements is n but indexing is the same 0 through n - 1;
int array[5] = {4, 5, 9, 22, 16};
std::cout << array[0]; // 4
std::cout << array[5 - 1]; // 16
I think your problem comes from the misconception of how you access the memory.
s[n] refers at accessing the value pointed by the pointer s plus n blocks in memory, it can also be written *(s + n)
So basically, declaring
char s[4];
and setting cat in it, you will get this layout in memory
+---+
s: | c | s[0] also (s + 0)
+---+
| a | s[1] also (s + 1)
+---+
| t | s[2] also (s + 2)
+---+
|\0 | s[3] also (s + 3)
+---+
| ? | s[4] also (s + 4)
+---+
| ? | s[5] also (s + 5)
+---+
| ? | s[6] also (s + 6)
+---+
| ? | s[7] also (s + 7)
+---+
| ? | s[8] also (s + 8)
+---+
| ? | s[9] also (s + 9)
+---+
The ? stand for a variable which we aren't sure of the value.
You CAN access it, you can even modify it sometimes.
But the behavior of this isn't clear and can be undefined.
For your example s[4] can change each time the executable is executed.
char s[10]; had valid indices from 0 to 9 and not 0 to 10 as you said.
A C-string variable is an array of characters.
This is slightly misleading. A C string is a sequence of character values followed by a 0-valued terminator. For example, the string "Hello" is represented as the sequence {'H','e', 'l', 'l', 'o', 0}. The presence of the 0 terminator makes the character sequence a string. All C string handling functions (strcat, strcmp, strcpy, strchr, etc.) assume the presence of that terminator; if the terminator isn't there, those routines will not function properly.
Strings are stored in arrays of character type (char for ASCII, EBCDIC, or UTF-8 strings or wchar_t for "wide" strings1). Multiple strings may be stored in a single array if there's sufficient space. A 10-element array may store a single 9-character string, or two 4 character strings, or 5 one-character strings. Remember that to store an N-character string you need N+1 array elements to account for the terminator.
I'm trying to understand the above. If the array size is 10, wouldn't the total storage size be 11? i.e. 0-10 = 11 spaces.
Total storage size is 10, but array elements are indexed from 0 to 9. Given the declaration
char foo[10];
you get the following layout in memory:
+---+
foo: | | foo[0]
+---+
| | foo[1]
+---+
| | foo[2]
+---+
| | foo[3]
+---+
| | foo[4]
+---+
| | foo[5]
+---+
| | foo[6]
+---+
| | foo[7]
+---+
| | foo[8]
+---+
| | foo[9]
+---+
For any N-element array, individual elements are indexed from 0 through N-1.
Remember that in C, the array subscript operation a[i] is defined as *(a + i) - given a starting address a2, offset i elements (not bytes!!!) from that address and dereference the result. The first element is stored at a, the second element is stored at a + 1, the third at a + 2, etc.
wchar_t was introduced to represent character sets outside the ranges defined by ASCII or EBCDIC, so it's wider than the `char` type (often the width of two `char`s). With the advent of schemes like UTF-8 to represent non-English character sets, it's not that useful and I don't see it used very often.
At some point, you're going to hear someone say "an array is just a pointer". This is not correct. Under most circumstances, an expression of array type will be converted ("decay") to an expression of pointer type, and the value of the expression will be the address of the first element in the array. The array object itself is not a pointer, nor does it set aside any space for a pointer value.
char test[4] is the definition of your array, it means your array's size is four, but you can only use the spaces: test[0], test[1], test[2], test[3].
You can go to http://www.cplusplus.com/doc/tutorial/arrays/ to learn more about Arrays of c++.

Issue with a char* array

Okay so I have:
char* arr[5];
and I have
char input[10];
and then:
int i = 0;
cin.getline(input, 10);
while(input[0] != 'z')
{
arr[i] = input;
cin.getline(input, 10);
i++;
}
the result is that every element in arr is the same because they are all pointers to input, but I want each element to point to char arrays that the input variable held at that given time.
result that this brings:
arr[0] = line beginning in 'z' (because that is what input currently is holding)
arr[1] = line beginning in 'z'
arr[2] = ... and so on
result that I want:
arr[0] = first line read in
arr[1] = second line read in
arr[2] = third line read in and so on...
I am confused about how I can get the elements to all point to new values instead of all be pointing to the same value.
If you want to take input 5 times,you can try this code segment
int i = 0;
while(i<5)
{
arr[i] = input;
cin.getline(input, 10);
i++;
}
this should work and you can get your "desired result" as you stated above.
Edit:
This will not work, as described in the comment. See example here: http://ideone.com/hUQGa7
What is required are different pointer values occupying each of the elements in arr. How to achieve those different pointer values is discussed in the other answers given.
Let's talk characters and pointers.
Given a character:
+---+
+ A +
+---+
A pointer to the character, char *, points to the character. That's it, nothing more.
An array of characters is a container that has slots for characters. A pointer to the array often points to the first character of the array. Here's where the problem comes in.
Many functions require a pointer, to the first character of the array, but either don't say that it's to the first character or require a pointer to a single character, assuming that the pointer is the first character in the array.
Pointers need something to Point at
You have allocated an array of pointers, but haven't allocated the memory for each pointer:
Array
+-----+ +----+---+---+
| | --> | | | |
| | +----+---+---+
+-----+
| | +----+---+---+
| | --> | | | |
| | +----+---+---+
+-----+
The content of the array are pointer. So you will need to allocate memory and place the pointer into the array:
arr[0] = new char [11]; // +1 for the nul terminator character.
char text[33];
arr[1] = text;
So what you aim to do is:
cin.getline(arr[0], 10);
cin.getline(arr[1], 33);
Strings are soooo much easier to deal with. They manage their own memory:
std::string array_text[5]; // An array of 5 string.
getline(cin, array_text[0]);
The next step is to use a vector and you are all caught up:
std::vector< std::string > text_vector(5); // Pre-allocate 5 slots.
getline(cin.text_vector[2]);
You have to copy the data of input everytime.
You can do like this in while loop:
while(input[0] != 'z')
{
arr[i] = new char[strlen(input)];
strcpy(arr[i], input);
cin.getline(input, 10);
i++;
}
And since you have defined arr to be of length 5. So you have to check in the while loop that i doesn't exceed the value 5. i<5.

Questions about pointers and what is/is not a pointer

Questions regarding, well, ultimately pointers to pointers (I suspect). Please read the questions posed in the commented code:
void doodah(char* a);
int main() {
char d[] = "message"; // one way of assigning a string.
// char* d = "message"; // another way of assigning a string, but REM'ed out for now.
cout << d << endl; // d appears not to be a pointer because cout outputs "message", and not an address. Why is this?
doodah(d); // a function call.
}
void doodah(char* a) {
cout << a << endl; // this outputs "message" - but why?! What does 'a' mean in this context?
// cout << *a << endl; // this outputs "m" - but why?! REM'ed out for now.
}
I am utterly confused! Please help.
cout knows how to output strings when given a char *. It does not attempt to print the pointer value itself.
An array is a bunch of objects next to each other somewhere in memory. The variable you've set the array to is actually secretly a pointer to the first item in that array (shhh!).
The biggest difference between char *c and char c[] is that the latter will be a const pointer, while the former is free to change. Also, C-Strings, like the one you have set there, are null terminated, meaning that the array ends in a binary 0 so things like cout will know when to stop iterating (this is also known as a one pass last).
For more information, you can read up on this question.
This is what the pointer a looks like in memory:
-------------------------------------------------------------
| | 'm' | 'e' | 's' | 's' | 'a' | 'g' | 'e' | '\0' |
| ^ |
| | |
| --- |
| |a| |
| --- |
-------------------------------------------------------------
a is a pointer to the first element of the string. When used in the stream inserter (operator<<()) the compiler will match it with the overload that takes a stream on its left hand side and a pointer to a character on its right hand side. It will then attempt to print every character until it reaches the null byte ('\0') by evaluating characters at incremental addresses from a.
The stream prints addresses through the overload that takes a void* on its righthand side. You can cast your pointer to a void* or use the standard-provided std::addressof() function as well:
std::cout << std::addressof(a);

What is the significance of a pointer to a pointer?

What is the difference between doing this:
int i = 5, j = 6, k = 7;
int *ip1 = &i, *ip2 = &j;
int *ipp = ip1;
and doing this:
int **ipp2 = &ip1;
Don't they do the same thing? hold a pointer(ip1) which points to a variable, i?
ipp2 points to ip1. This is entirely different from pointing to i.
Sample code:
int *ip1 = &i;
int **ipp2 = &ip1;
printf("%d\n", **ipp2); // 5
ip1 = &j;
printf("%d\n", **ipp2); // 6
All the variables have a location in memory where their values are held. Let's explore the relationships between the values of i, ip1, ipp, and ipp2
This is what you get when the statement i = 5; is executed. i has its location in memory and the value at location is set to 5.
i -> +--------+
| 5 |
+--------+
^
|
A1 (address of i)
This is what you get when the statement int* ip1 = &i; is executed. ip1 has its location in memory and the value at that location is set to the address of i, which we designate as A1.
ip1 -> +--------+
| A1 |
+--------+
^
|
A2 (address of ip1)
This is what happens when you execute the statement int* ipp = ip1;. The value at the memory location of ipp is set to the value of ip1, which is A1.
ipp -> +--------+
| A1 |
+--------+
^
|
A3 (address of ipp)
This is what happens when you execute the statement int** ipp2 = &ipp;. The value at the memory location of ipp2 is set to A3, which is the address of `ip1.
ipp2 -> +--------+
| A3 |
+--------+
^
|
A4 (address of ipp2)
How does dereferencing work:
*ip1 = *A1 = 5
*ipp = *A1 = 5
*ipp2 = *A3 = A1
**ipp2 = **A3 = *A1 = 5
Pointers are often used to change the value of a variable inside a function:
void incr(int *ip) { *ip++; }
void f() { int i = 0; incr(&i); printf("%d\n", i); // 1
Now it's not any different with a pointer to a pointer. You can pass the pointer to a pointer to a function, and that function can change what that pointer points to: the original pointer!
char *mom = "mom";
char *pop = "pop";
chooseMomOrPop(int choosePop, char **momOrPop) { *momOrPop = choosePop ? pop : mom; }
void f() { char *mp = mom; chooseMomOrPop(1, &mom); printf("%s\n", mom); } // pop
Very simple: a pointer is an address where a variable stays in memory. Since a pointer is itself a variable, its address could be stored in another pointer, and so on. To better have in mind how a pointer is and how it works, just think it is an address. The type of the pointer, ie int in int * refers to the type of the data pointed, and effect how the pointer "react" to addition or subtraction, as described in pointer arithmetic. A 'pointer to pointer of int' is an int**, so is always an address pointing to an int*, and when incremented it will move the address to as many byte as necessary to point the next int*

differences between new char[n] and new (char[n])

Is there any difference between new char[n] and new (char[n])?
I have the second case in a generated code, g++ (4.8.0) gives me
ISO C++ does not support variable-length array types [-Wvla]
This makes me think if these two are the same or not.
new char[n] means "allocate n objects of type char.
does new (char[n]) mean "allocate 1 object of type array of n chars"?
deleting the first is clear.
should I delete the second with delete or delete[]?
are there any other differences I should be aware of?
may I safely remove the parentheses and turn the second case into the first, when other parts of the software expect the second?
The code is generated by a third party software (and used by other parts of the software), so I cannot just "use vector instead".
This is minimal example:
int main (void)
{
int n(10);
int *arr = new (int[n]); // removing parentheses fixes warning
*arr = 0; // no "unused variable" warning
return 0;
}
The basic issue here is that C++ does not allow an array bound [n] to be used in a type unless n is a constant expression. g++ and some other compilers will sometimes allow it anyway, but it's impossible to get consistent behavior when you start mixing variable-length-arrays and templates.
The apparent exception int* p = new int[n]; works because here the [n] is syntactically part of the new expression, not part of the type provided to the new, and new does "know how" to create arrays with length determined at runtime.
// can be "constexpr" in C++11:
const int C = 12;
int main() {
int* p1 = new int[C];
int* p2 = new (int[C]);
typedef int arrtype[C];
int* p3 = new arrtype;
int n = 10;
int* p4 = new int[n];
// int* p5 = new (int[n]); // Illegal!
// typedef int arrtype2[n]; // Illegal!
// int* p6 = new arrtype2;
delete[] p1;
delete[] p2;
delete[] p3;
delete[] p4;
}
Semantically, though, after any final [C] is used to convert a type into an array type, the new expression only cares about whether it's dealing with an array or not. All the requirements about type of the expression, whether to use new[] and delete[], and so on say things like "when the allocated type is an array", not "when the array new syntax is used". So in the example above, the initializations of p1, p2, and p3 are all equivalent, and in all cases delete[] is the correct deallocation form.
The initialization of p4 is valid, but the code for p5 and p6 is not correct C++. g++ would allow them anyway when not using -pedantic, and by analogy I'd expect the initializations for p4, p5, and p6 to also all be equivalent. #MM's disassembly supports that conclusion.
So yes, it should be a safe improvement to remove the "extra" parentheses from this sort of expression. And the correct deletion is the delete[] type.
new T[N] makes an array of N elements of type T.
new (T[N]) makes a single object of type T[N].
The effect is the same (and both expressions yield a T * that points to the first element of the array and needs to be deleted with delete[] (cf. 5.3.4/5), but clearly T[N] must be a valid type in the latter case, so N must be a constant expression, while in the former case it is a dynamic argument of the array-new expression.
new int [n]
//Allocates memory for `n` x `sizeof(int)` and returns
//the pointer which points to the beginning of it.
+-----+-----+-----+-----+-----+-----+-----+-----+------+------+
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+------+------+
new (int [n])
//Allocate a (int[n]), a square which its item is an array
+----------------------------------------------------------------+
|+-----+-----+-----+-----+-----+-----+-----+-----+------+------+ |
|| | | | | | | | | | | |
|| | | | | | | | | | | |
|| | | | | | | | | | | |
|+-----+-----+-----+-----+-----+-----+-----+-----+------+------+ |
+----------------------------------------------------------------+
In fact both of them are equal.
Here is the code generated by assembler (just an ignorable difference):
int n = 10;
int *i = new (int[n]);
int *j = new int[n];
i[1] = 123;
j[1] = 123;
----------------------------------
! int *i = new (int[n]);
main()+22: mov 0x1c(%esp),%eax
main()+26: sub $0x1,%eax
main()+29: add $0x1,%eax
main()+32: shl $0x2,%eax
main()+35: mov %eax,(%esp)
main()+38: call 0x401620 <_Znaj> // void* operator new[](unsigned int);
main()+43: mov %eax,0x18(%esp)
! int *j = new int[n];
main()+47: mov 0x1c(%esp),%eax
main()+51: shl $0x2,%eax
main()+54: mov %eax,(%esp)
main()+57: call 0x401620 <_Znaj> // void* operator new[](unsigned int);
main()+62: mov %eax,0x14(%esp)
!
! i[1] = 123;
main()+66: mov 0x18(%esp),%eax
main()+70: add $0x4,%eax
main()+73: movl $0x7b,(%eax)
! j[1] = 123;
main()+79: mov 0x14(%esp),%eax
main()+83: add $0x4,%eax
main()+86: movl $0x7b,(%eax)
You must delete both of them by delete [] ...