I read a question on the difference between:
const char*
and
const char[]
whereas for a while, I thought arrays were just syntactic sugar for pointers.
But something is bugging me. I have a piece of code similar to the following:
namespace SomeNamespace {
const char* str = { 'b', 'l', 'a', 'h' };
}
I get: error: scalar object 'str' requires one element in initializer.
So, I tried this:
namespace SomeNamespace {
const char str[] = { 'b', 'l', 'a', 'h' };
}
It worked. At first I thought this might have to do with the fact that an extra operation is applied
when it is a const char*, and GCC is never a fan of operations being performed outside a function (which is bad practice anyway), but the error does not seem to suggest so.
However in:
void Func() {
const char* str = { 'b', 'l', 'a', 'h' };
}
It compiles just fine as expected. Does anyone have any idea why this is so?
x86_64/i686-nacl-gcc 4(.1.4?) pepper 19 toolchain (basically GCC).
First off, it doesn't make a difference if you try to use compound initialization at namespace scope or in a function: neither should work! When you write
char const* str = ...;
you get a pointer to a sequence of chars which can, e.g., be initialized with a string literal. In any case, the chars are located somewhere else than the pointer. On the other hand, when you write
char const str[] = ...;
You define an array of chars. The size of the array is determined by the number of elements on the right side and, e.g., becomes 4 in your example { 'b', 'l', 'a', 'h' }. If you used, e.g., "blah" instead, the size would, of course, be 5. The elements of the array are copied into the location where str is defined in this case.
Note that char const x[] can be equivalent to writing char const* x in some contexts: when you declare a function argument, char const x[] actually is the same as char const*.
I am trying to understand how pointers, arrays and string literals work in C++.
Suppose we have the following line of code:
const char* const letters[] = {"A+","A"};
If I understand correctly, this declaration declares letters to be an array of constant pointers to constant characters. From my understanding, the compiler will actually convert each string literal to a null terminated char array and each element of letters is actually a constant pointer to the first element of that array.
So, for instance, letters[0] is actually a pointer to the "A" of "A+". However
std::cout<< letters[0];
actually outputs "A+" to the standard output. How can this be? Especially since letters[0] is a constant pointer?
My second question is related to the declaration above: if string literals are actually const char arrays, then why does the following line of code
const char* const letters[] = {{'A','+','\0'},{'A','\0'}};
throw
error: braces around scalar initializer for type ‘const char* const’
const char* const letters[] = {{'A','+','\0'},{'A','\0'}};
^
Thank you!
The standard specifies that a string literal is represented - as far as your program is concerned - as an array of const characters of static storage duration with a trailing '\0' terminator. The standard doesn't specify HOW a compiler achieves this effect, only that your program can treat the string literal in that way.
So modifying a string literal is either prevented (e.g. passing a string literal to a function expecting a char * is a diagnosable error, and the code will not compile) or - if code works around the type system to modify any character in a string literal - involves undefined behaviour.
In your example, letters[0] is of type const char *, and has a value equal to the address of the first character in the string literal "A+".
std::cout, being of type std::ostream, has an operator<<() that accepts a const char *. This function is called by the statement std::cout << letters[0] and the function assumes the const char * points at a zero-terminated array of char. It iterates over that array, outputting each character individually, until it encounters the trailing '\0' (which is not output).
The thing is, a const char * means that the pointer is to a const char, not that the pointer cannot be changed (that would be char * const). So it is possible to increment the pointer, but not change the value it points at. So, if we do
const char *p = letters[0];
while (*p != '\0')
{
std::cout << *p;
++p;
}
which loops over the characters of the string literal "A+", printing each one individually, and stopping when it reaches the '\0' (the above produces the same observable output as std::cout << letters[0]).
However, in the above
*p = 'C';
will not compile, since the definition of p tells the compiler that *p cannot be changed. However, incrementing p is still allowed.
The reason that
const char* const letters [] = {{'A','+','\0'},{'A','\0'}};
does not compile is that an array initialiser cannot be used to initialise pointers. For example:
const int *nums = {1,2,3}; // invalid
const int * const nums2[] = {{1,2,3}, {4,5,6}}; // invalid
are both illegal. Instead, one is required to define arrays, not pointers.
const int nums[] = {1,2,3};
const int nums2[][3] = {{1,2,3}, {4,5,6}};
All versions of C and C++ forbid initialising pointers (or arrays of pointers in your example) in this way.
Technically, the ability to use string literals to initialise pointers is actually the anomaly, not the prohibition on initialising pointers using arrays. The reasons C introduced that exemption for string literals are historical (in very early days of C, well before K&R C, string literals could not be used to initialise pointers either).
As for your first question, the type of letters[0] is const char * const. This is a pointer to a character, but not a character itself. When passing a pointer to a character to std::cout, it will treat it as a NUL-terminated C string, and writes out all characters from the start of the memory pointed to until it encounters a NUL-byte. So that is why the output will be A+. You can pass the first character of the first string by itself by writing:
std::cout << letters[0][0];
The fact that the pointers and/or the C strings themselves are const doesn't matter here, since nothing is writing to them.
As for your second question, each element of letters is a const char * const, which is a single pointer, not an array, so it cannot be initialised from a brace-enclosed list of characters. If you really wanted two arrays of characters, write:
const char letters[][3] = {{'A', '+', '\0'}, {'A', '\0'}};
That holds the same strings as the code from your first question, but stored as arrays rather than pointers. Or if you want a single array:
const char letters[] = {'A', '+', '\0', 'A', '\0'};
That line is equal to:
const char letters[] = "A+\0A";
I am a beginner student in C++ and there is one thing I cannot understand when working with character arrays. I know that pointers are essentially variables that "point" to the memory address of another variable, and that the name of an array (e.g. int a[20]) is a constant pointer to the values in that array. When working with numeric types (int, float, etc.), if we output the name of the array, it shows the address of the first element, but if we do the same with a char array, it shows not the address but the contents of the array.
Example:
#include <iostream>
using namespace std;
int main()
{
    int a[] = {1, 2, 3, 4, 5};
    cout << a << endl;  // prints the memory address of the first element of the array
    char b[] = {"Mountain"};
    cout << b;          // prints the word "Mountain"
    return 0;
}
Is the pointer from a char array automatically converted to its value when you output it?
There is no difference. char pointers aren't somehow magically different from int pointers. So what's going on, then?
std::cout << has overloads for char pointers (and the older printf() has the %s conversion for them), meaning that it behaves differently when the input is a char*: the characters are read one after another until a '\0' character is reached (see null-terminated string).
char b[]={"Mountain"};
b does not contain
{'M', 'o', 'u', 'n', 't', 'a', 'i', 'n'}
but instead
{'M', 'o', 'u', 'n', 't', 'a', 'i', 'n', '\0'} <- '\0'
making the iterating and stopping possible.
This also explains why the array size of b is 1 larger than the number of characters inside the word.
To add, you should avoid raw char pointers like these. They are dangerous and have long been replaced by modern utilities like std::string.
Likewise, int a[]={1,2,3,4,5}; is OK, but std::array<int, 5> a = {1,2,3,4,5}; is even better:
the types are distinct (std::array<int, 4> and std::array<int, 5> are different types)
it has a .size() function
you can therefore pass it to other functions without having to add a size argument
it's as fast as a normal array
std::array can be used by including <array>.
If you ever reach for something like int* a = new int[5]; then stop right there and use std::vector instead.
Finally, never write using namespace std; since it pulls every name from std into the current scope and invites name collisions.
It all depends on how the parameter is interpreted. The << operator of cout treats a pointer of most types (sometype*) as an address, but treats char* in particular as a string.
If you write a function taking your own parameters, you can interpret them whichever way you like.
In this case, if you want to get the address, you can do it like this:
std::cout << static_cast<const void*>(b);
In C, strings are represented as a pointer to char. For this reason, when you pass a char* to an ostream (such as std::cout) in C++, it will interpret it as a null-terminated string and print that string's content rather than the address. If you want to print the address, you'll have to cast that pointer to a different kind:
std::cout << (void*)b;
cout is an output stream. When we pass a char* to an output stream, it treats it as a null-terminated string (i.e. it prints all the characters until it finds '\0'). For any other pointer type, the address is printed.
Why can I create a string or array of chars in this way:
#include <iostream>
int main() {
const char *string = "Hello, World!";
std::cout << string[1] << std::endl;
}
It outputs the second element correctly, yet I can't make an array of integers in the same way without the array subscript notation [ ]. What's the difference between the char version and this one: const int* intArray={3,54,12,53};?
The "why" is: "Because string literals are special". The string literal is stored in the binary, as a constant part of the program itself, and const char *string = "Hello, World!"; is just treating the literal as an anonymous array stored elsewhere which it then stores a pointer to in string.
There is no equivalent special behavior for other types, but you can get the same basic solution by making a named static constant and using that to initialize the pointer, e.g.
int main() {
static const int intstatic[] = {3,54,12,53};
const int *intptr = intstatic;
std::cout << intptr[1] << std::endl;
}
The effect of the static const array is to allocate the same constant space the string literal would use (though unlike string literals, it's less likely that the compiler will identify duplicate arrays and coalesce the storage), but as a named variable rather than an anonymous one. The string case could be made explicit in the same way:
int main() {
static const char hellostatic[] = "Hello, World!";
const char *string = hellostatic;
std::cout << string[1] << std::endl;
}
but using the literal directly makes things a little cleaner.
You almost can. There are a couple of things at work.
{1,2,3} and "abc" are not the same thing. In fact, if you wanted to draw a comparison, "abc" should rather be compared to {'a', 'b', 'c', '\0'}. Both of them are valid array initializers:
char foo[] = "abc";
char bar[] = {'a', 'b', 'c', '\0'};
However, only "abc" is also a valid expression to initialize a pointer in C++.
In C (and as an extension in some C++ compilers, including Clang and GCC), you can cast compound literals to an array type, like this:
static const int* array = (const int[]){1, 2, 3};
However, this is almost never correct. It works at the global scope and as a function argument, but if you try to initialize a variable of automatic storage with it (i.e. a variable within a function), you'll get a pointer to a location that is about to expire, so you won't be able to use it for anything useful.
Such a feature exists in C and is named compound literal.
For example
#include <stdio.h>
int main(void)
{
const int *intArray = ( int[] ){ 3, 54, 12, 53 };
printf( "%d\n", intArray[1] );
return 0;
}
However C++ does not support this feature from C.
There is a difference compared with string literals. String literals have static storage duration regardless of where they appear, while compound literals have either static or automatic storage duration depending on where they appear.
In C++ something that is close to this feature is std::initializer_list . For example
#include <iostream>
#include <initializer_list>
int main()
{
const auto &myArray = { 3, 54, 12, 53 };
std::cout << myArray.begin()[1] << std::endl;
return 0;
}
String literals come from the C language. Any string written in double quotes in the code is automatically treated as a const char[].
So this:
const char str[6] = "hello";
Is exactly the same as:
const char str[6] = { 'h', 'e', 'l', 'l', 'o', '\0' };
By accident I found that the line char s[] = {"Hello World"}; is properly compiled and seems to be treated the same as char s[] = "Hello World";. Isn't the first ({"Hello World"}) an array containing one element that is an array of char, so the declaration for s should read char *s[]? In fact if I change it to char *s[] = {"Hello World"}; the compiler accepts it as well, as expected.
Searching for an answer, the only place I found which mentioned this is this one but there is no citing of the standard.
So my question is, why the line char s[] = {"Hello World"}; is compiled although the left side is of type array of char and the right side is of type array of array of char?
Following is a working program:
#include<stdio.h>
int main() {
char s[] = {"Hello World"};
printf("%s", s); // Same output if line above is char s[] = "Hello World";
return 0;
}
Thanks for any clarifications.
P.S. My compiler is gcc-4.3.4.
It's allowed because the standard says so: C99 section 6.7.8, §14:
An array of character type may be initialized by a character string literal, optionally
enclosed in braces. Successive characters of the character string literal (including the
terminating null character if there is room or if the array is of unknown size) initialize the
elements of the array.
What this means is that both
char s[] = { "Hello World" };
and
char s[] = "Hello World";
are nothing more than syntactic sugar for
char s[] = { 'H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd', 0 };
On a related note (same section, §11), C also allows braces around scalar initializers like
int foo = { 42 };
which, incidentally, fits nicely with the syntax for compound literals
(int){ 42 }
The braces are optional, and the expression is equivalent to just an array of char.
You can also write this:
int a = {100}; //ok
Demo : http://ideone.com/z0psd
In fact, C++11 generalizes this very syntax, to initialize non-arrays as well as arrays, uniformly. So in C++11, you can have these:
int a{}; //a is initialized to zero, and it is NOT an array
int b[]{1,2,3,4}; //b is an array of size 4 containing elements 1,2,3,4
int c[10]{}; //all 10 elements are initialized to zero
int *d{}; //pointer initialized to nullptr
std::vector<int> v{1,2,3,4,5}; //vector is initialized uniformly as well.
The initializer of any scalar variable (int, char, etc.) may also be enclosed in braces.
char s = {0};
works as well.
I might be wrong, but I think this is not an array of arrays of chars, but a braced block that contains an array of chars. int a = {1}; may work as well.
[...] In fact if I change it to
char *s[] = {"Hello World"}; the compiler accepts it as well, as
expected
The compiler accepts it because you are actually declaring an array of pointers to char whose size is deduced from the initializer, and you stored only one element in it: a pointer to the "Hello World" string. Something like this:
char* s[] = {"Hello world", "foo", "baa" ...};
You can't omit the braces in this case.
This is allowed by the C++ standard as well, Citation:
[dcl.init.string] §1
An array of narrow character type ([basic.fundamental]), char16_t array, char32_t array, or wchar_t array can be initialized by a narrow string literal, char16_t string literal, char32_t string literal, or wide string literal, respectively, or by an appropriately-typed string literal enclosed in braces ([lex.string]). [snip]
I'm new to C++ and I'd like to know right from the start:
Do all of these methods of making strings work exactly the same way and always give the exact same results in every case? Is there any difference in the result of any of them?
1) char greeting [6] = { 'h','e','l','l','o','\0' };
2) char greeting[] = "hello";
3) #include <string>
string greeting = "hello";
1) and 2) work exactly the same. Both create a 6-element non-heap-allocated array, and copy the characters 'h', 'e', 'l', 'l', 'o', '\0' to the array at runtime or load time.
3) creates an instance of std::string and calls its constructor which copies the characters 'h', 'e', 'l', 'l', 'o'(, '\0')* to its internal memory buffer. (* The '\0' is not required to be stored in the memory buffer.)
There is another way to declare a string in C++, using a pointer to char:
const char* greeting = "hello";
This will not copy anything. It will just point the pointer to the first character 'h' of the null-terminated "hello" string which is located somewhere in memory. The string is also read-only (modifying it causes undefined behavior), which is why one should use a pointer-to-const here.
If you're wondering which one to use, choose std::string, it's the safest and easiest.
Do these methods of making strings work exactly the same way and give the exact same results always in every case?
The first two are array definitions:
char greeting [6] = { 'h','e','l','l','o','\0' };
char greeting [ ] = "hello";
These "work the same", since in the second definition a '\0' is appended implicitly.
As for the third definition:
string greeting = "hello";
A string is a class type object and as such it is more complex than a simple array.
Is there any difference in the result in any of them?
There is a quantitative (1) and a qualitative (2) difference between the first two and the third, stemming from the fact that std::string is a class type object.
1. Quantitative: arrays occupy less memory space than string.
2. Qualitative: string provides resource management and many facilities for element manipulation.