Difference between char[] and char*? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
C - Difference between “char var[]” and “char *var”?
Difference between char a[]=“string”; char *p=“string”;
Would someone explain exactly what the difference between char[] and char* is?
For example, what is the difference between
char name[] = "earth";
and
char *name = "earth";
thanks

char namea[] = "earth";
char *pname = "earth";
One is an array (the name namea refers to a block of characters).
The other is a pointer to a single character (the name pname refers to a pointer, which just happens to point to the first character of a block of characters).
Although the former will often decay into the latter, that's not always the case. Try doing a sizeof on them both to see what I mean.
The size of the array is, well, the size of the array (six characters, including the terminal null).
The size of the pointer is dependent on your pointer width (4 or 8, or whatever). The size of what pname points to is not the array, but the first character. It will therefore be 1.
You can also move pointers with things like pname++ (unless they're declared constant, with something like char *const pname = ...; of course). You can't move an array name to point to its second character (namea++;).
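The size and movement differences described above can be checked with a small sketch. The struct Sizes and the function measure are hypothetical names used only for illustration, and the pointer is declared const char* so that it compiles as modern C++.

```cpp
#include <cassert>
#include <cstddef>

// Sizes of an array, a pointer, and the single character the pointer refers to.
struct Sizes {
    std::size_t array_bytes;
    std::size_t pointer_bytes;
    std::size_t pointee_bytes;
};

Sizes measure() {
    char namea[] = "earth";        // array of 6 chars, including the '\0'
    const char *pname = "earth";   // pointer to the first char of a literal

    Sizes s{sizeof namea, sizeof pname, sizeof *pname};

    ++pname;                       // a pointer can be moved...
    assert(*pname == 'a');         // ...and now refers to the second character
    // ++namea;                    // would not compile: an array name cannot be moved
    return s;
}
```

The pointer size depends on the platform, so the sketch only reports it and does not assume 4 or 8.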

(1) char name[] = "earth";
name is a character array with the contents 'e','a','r','t','h',0. Where these characters are stored depends on where name[] is declared (typically the stack for a local variable, or the data segment for a global or static one).
(2) char *name = "earth";
name is a pointer to a string literal, which must be treated as const. The literal "earth" is typically stored in a read-only memory area.
In C++, this conversion was deprecated and is ill-formed since C++11; it should be const char *name = "earth";

char name[]= "earth"; creates a mutable array of size 6 with the value earth\0 (on the stack, if it is declared inside a function).
char* name = "earth"; defines a pointer to a string constant with the value earth\0.
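A minimal sketch of the mutable-array versus string-constant distinction follows. The function name edit_copy_not_literal is made up for this example, and the write to the literal is left commented out because it must not be attempted.

```cpp
#include <cassert>
#include <cstring>

// Returns true if the array copy could be edited while the literal stayed intact.
bool edit_copy_not_literal() {
    char name[] = "earth";          // the array holds its own copy of the text
    const char *lit = "earth";      // const makes the compiler reject writes

    name[0] = 'E';                  // fine: the array owns this storage
    // lit[0] = 'E';                // does not compile, which is the point of const

    assert(sizeof name == 6);       // five letters plus the '\0'
    return std::strcmp(name, "Earth") == 0 && std::strcmp(lit, "earth") == 0;
}
```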

char[] describes an array of char with a fixed number of elements.
char* describes a pointer to a char, typically followed in memory by a sequence of char's typically terminated by a null char \0

With
char *name = "earth";
you must not modify the contents that name points to.
Hence
name[2] = 'A';
is undefined behaviour and will typically cause a segfault.
A char* string is terminated by a '\0' character, while name[] has a fixed size.
Initializing the array has a cost that the pointer does not: a local array
takes up space on the stack and has the literal copied into it every time
execution enters the variable's scope. Only use the array method if you
intend to change the string; use the pointer method otherwise.

Related

Intricacies of strcpy_s in C++ [duplicate]

This question already has answers here:
How to find the size of an array (from a pointer pointing to the first element array)?
(17 answers)
Closed 7 years ago.
I am having a difficult time obtaining the correct size of a string in order to satisfy strcpy_s. For example if I specify
char buffer = {0};
char *str1 = (char*)&buffer;
strcpy_s(str1,sizeof("This is a string\n"),"This is a string\n");
Then it will work as expected. If however I declare the following:
char buffer = {0};
char *str1 = (char*)&buffer;
const char* string1 = "This is a string.....";
strcpy_s(str1, ?????,string1);
If I use anything other than a literal in place of ????? it will fail with a memory exception, for example if I use std::strlen(str1), etc. Any size literal for ????? will work. Of course using a fixed literal is not acceptable.
This is a major re-edit of the original question and I apologise to the people who have answered to date. However none of the answers below have worked.

"This is a string" is a character array. When you say sizeof(Array)/sizeof(type) it will give the size of the array
When you define the string as const char*, sizeof(pointer) gives the size of the pointer itself, not the array size
const char* ptr = "This is a string\n";
std::cout<<sizeof("This is a string\n")<<std::endl; //==>18
std::cout<<sizeof(ptr)<<std::endl; //==>4 on a 32-bit target, 8 on a typical 64-bit one

First of all, the second parameter is the size of the destination buffer, not the size of the source buffer.
so the correct way is:
char str1[100];
strcpy_s(str1, sizeof str1, "Whatever string");
or
int n = 100;
char *str1 = new char[n];
strcpy_s(str1, n, "whatever string");
For an array (first example) sizeof returns the size of the array.
For a pointer (second example) sizeof returns the size of the pointer (which is not what you want)

In your second example, string1 is of type const char*. sizeof will return the size of the pointer, rather than the length of the string literal you are pointing to.
The first example works because a string literal is a const char[], and sizeof will correctly return the length of the string (but with the null terminating character as well). It's only coincidental that this works because char is 1 byte. Do not use sizeof to get string lengths.
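To make your second example work, get the length with std::strlen(string1) (add 1 for the terminator), and make sure the destination buffer is at least that large.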
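Since strcpy_s is not available on every toolchain, here is a portable sketch of the same idea: the size passed in is the capacity of the destination, and the copy is refused when the source would not fit. The helper copy_if_fits and the buffer sizes are assumptions made for illustration; this is not how strcpy_s itself is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dst only if it fits, terminator included.
// dst_size is the capacity of the destination buffer, not the source length.
bool copy_if_fits(char *dst, std::size_t dst_size, const char *src) {
    std::size_t needed = std::strlen(src) + 1;   // +1 for the '\0'
    if (dst == nullptr || needed > dst_size) {
        return false;                            // would overflow: refuse
    }
    std::memcpy(dst, src, needed);
    return true;
}

bool demo_copy() {
    char buffer[32] = {0};                       // a real array, so sizeof works
    char tiny[4] = {0};
    const char *string1 = "This is a string.....";

    assert(!copy_if_fits(tiny, sizeof tiny, string1));   // too small: rejected
    bool ok = copy_if_fits(buffer, sizeof buffer, string1);
    return ok && std::strcmp(buffer, string1) == 0;
}
```

Note that sizeof buffer only gives the capacity because buffer is an array; with a pointer to heap memory the capacity has to be tracked separately.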

pointer to string and char catch 22

I'm studying pointers and I got stuck when I saw char *p[10], because I have misunderstood something. Can someone explain step by step why my logic is wrong, where my mistakes are, and how I should think about it? I want to understand it exactly. Also, what about int *p[10];? Besides, for example, x is a pointer to char, a single char and not several chars. So how come char *x = "possible"; works?
I think the above should be right, but I have also seen char *name[] = { "no month","jan","feb" }; and I am really confused.

Your char *p[10] diagram shows an array where each element points to a character.
You could construct it like this:
char f = 'f';
char i = 'i';
char l1 = 'l';
char l2 = 'l';
char a1 = 'a';
char r1 = 'r';
char r2 = 'r';
char a2 = 'a';
char y = 'y';
char nul = '\0';
char *p[10] = { &f, &i, &l1, &l2, &a1, &r1, &r2, &a2, &y, &nul };
This is very different from the array
char p[10] = {'f', 'i', 'l', 'l', 'a', 'r', 'r', 'a', 'y', '\0'};
or
char p[10] = "fillarray";
which are arrays of characters, not pointers.
A pointer can equally well point to the first element of an array, as you've probably seen in constructions like
const char *p = "fillarray";
where p holds the address of the first element of an array defined by the literal.
This works because an array can decay into a pointer to its first element.
The same thing happens if you make an array of pointers:
/* Each element is a pointer to the first element of the corresponding string in the initialiser. */
const char *name[] = { "no month","jan","feb" };
You would get the same results with
const char* name[3];
name[0] = "no month";
name[1] = "jan";
name[2] = "feb";
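As a rough illustration of working with such an array of pointers, the sketch below walks the array and measures each string. The function names total_length and demo_names are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Sums the lengths of the strings that an array of char pointers refers to.
std::size_t total_length(const char *const names[], std::size_t count) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += std::strlen(names[i]);   // names[i] is a pointer to a first char
    }
    return total;
}

std::size_t demo_names() {
    const char *name[] = { "no month", "jan", "feb" };
    std::size_t count = sizeof name / sizeof name[0];   // number of pointers
    assert(count == 3);
    assert(name[1][0] == 'j');    // first character of the second string
    return total_length(name, count);
}
```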

char c = 'a';
Here, c is a char, typically a single byte of ASCII encoded data.
char* ptr = &c;
ptr is a char pointer. In C, all it does is point to a memory location and doesn't make any guarantees about what is at that location. You could use a char* to pass a char to a function to allow the function to make changes to that char (pass by reference).
A common C convention is for a char* to point to a memory location where several characters are stored in sequence followed by the null character \0. This convention is called a C string:
char const* cstr = "hello";
cstr points to a block of memory 6 bytes long, ending with a null character. The data itself cannot be modified, though the pointer can be changed to point to something else.
An array of chars looks similar, but behaves slightly differently.
char arr[] = "hello";
Here arr IS a memory block of 6 chars. Since arr represents the memory itself, it cannot be changed to point to another location. The data can be modified though.
Now,
char const* name[] = { "Jan", " Feb"..., "Dec"};
is an array of pointers to characters.
name is a block of memory whose elements each contain a pointer to a null-terminated string.
In the diagram, I think string* was accidentally used instead of char*. The difference between the left and the right, is not a technical difference really, but a difference in the way a char* is used. On the left each char* points to a single character, whereas in the one on the right, each char* points to a null-terminated block of characters.

Both are right.
A pointer in C or C++ may point either to a single item (a single char) or to the first in an array of items (char[]).
So a char *p[10]; definition declares an array of 10 pointers, which may point to 10 single characters or to 10 arrays (i.e. 10 strings).

Let’s go back to basics.
First, char *p is simply a pointer. p contains nothing more than a memory address. That memory address can point to anything, anywhere. By convention, we have always used NULL to mark a pointer that points to nothing (or, I hate this method, assigning it to zero – yeah, they are the same “thing”, but NULL has traditionally been used in conjunction with pointers, so when your eyes flit across the code, you see NULL – you think “pointer”).
Anyway, that memory address being pointed to can contain anything. So, to use within the language, we type it, in this case it is a pointer to a character (char *p). This can be overridden by type casting, but that’s for a later time.
Second, we know anytime we see p[10], that we are dealing with an array. Again, the array can be an array of characters, an array of ints, etc. – but it’s still an array.
Your example: char *p[10], is then nothing more than an array of 10 character pointers. Nothing more, nothing less. Your problem comes in because you are trying to force the “string” concept onto this. There ain’t no strings in C. There ain’t no objects in C. The concept of a NULL-terminated string can most certainly be used. But a “string” in C is nothing more than an array of characters, terminated by a NULL (or, if you use some of the appropriate functions, you can use a specific number of characters – strncpy instead of strcpy, etc.). But, for all its appearance, and apparent use, there are no strings in C. They are nothing more than arrays of characters, with a few supporting functions that happen to stop going through the array when a NULL is encountered.
So – char a[10] – is simply an array of characters that is 10 characters long. You can fill it with any characters you wish. If one of those is the NULL character, then that terminates what is typically called a “C-style string”. There are functions that support this type of character array (i.e. “string”), but it is still a use of a character array.
Your confusion comes in because you are trying to mix C++ string objects, and forcing that concept onto C arrays of characters. As ugoren noted – your examples are both correct – because you are dealing with arrays of character pointers, NOT strings. Again, putting a NULL somewhere in that character array is happily supported by several C functions that give you the ability to work with a “string-like” concept – but they are not truly strings. Unless of course, you want to phrase it that a string is nothing more than one character following another – an array.

how to understand char * ch="123"?

How should I understand char * ch="123"?
'1' is a char, so I can use:
char x = '1';
char *pt = &x;
But how do I understand char *pt="123"? Why can the char *pt point to string?
Is pt's value the first address value for "123"? If so, how do I get the length for the string pointed to by pt?

That is actually a really good question, and it is the consequence of several oddities in the C language:
1: A pointer to a char (char*) can of course also point to a specific char in an array of chars. That is what pointer arithmetic relies on:
// create an array of three chars
char arr[3] = { 'a', 'b', 'c'};
// point to the first char in the array
char* ptr = &arr[0];
// now point to the third char in the array
ptr = &arr[2];
2: A string literal ("foo") is actually not a string as such, but simply an array of chars, followed by a null byte. (So "foo" is actually equivalent to the array {'f', 'o', 'o', '\0'})
3: In C, arrays "decay" into pointers to the first element. (This is why many people incorrectly say that "there is no difference between arrays and pointers in C"). That is, when you try to assign an array to a pointer object, it sets the pointer to point to the first element of the array. So given the array arr declared above, you can do char* ptr = arr, and it means the same as char* ptr = &arr[0].
4: In every other case, syntax like this would make the pointer point to an rvalue (loosely speaking, a temporary object, which you can't take the address of), which is generally illegal. (You can't do int* ptr = &42). But when you define a string literal (such as "foo"), it does not create an rvalue. Instead, it creates the char array with static storage. You're creating a static object, which is created when the program is loaded, and of course a pointer can safely point to that.
5: String literals are actually required to be marked as const (because they are static and read-only), but because early versions of C did not have the const keyword, you are allowed to omit the const specifier (at least prior to C++11), to avoid breaking old code (but you still have to treat the variable as read-only).
So char* ch = "123" really means:
write the char array {'1', '2', '3', '\0'} into the static section of the executable (so that when the program is loaded into memory, this variable is created in a read-only section of memory)
when this line of code is executed, create a pointer which points to the first element of this array
As a bonus fun fact, this differs from char ch[] = "123";, which instead means
write the char array {'1', '2', '3', '\0'} into the static section of the executable (so that when the program is loaded into memory, this variable is created in a read-only section of memory)
when this line of code is executed, create an array on the stack which contains a copy of this statically allocated array.
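The points above about literal types and decay can be checked at compile time and at run time. This is a sketch in C++11, with decay_matches_first_element as a made-up function name.

```cpp
#include <cassert>
#include <type_traits>

// A string literal is an array of const char; "123" has four elements.
static_assert(std::is_same<decltype("123"), const char (&)[4]>::value,
              "a string literal carries its terminator");

bool decay_matches_first_element() {
    char arr[3] = { 'a', 'b', 'c' };
    char *from_decay = arr;         // the array decays to a pointer
    char *from_address = &arr[0];   // explicit address of the first element
    assert(from_decay == from_address);

    const char *ch = "123";         // the same decay, applied to the literal
    return from_decay == from_address && ch[0] == '1' && ch[3] == '\0';
}
```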

char* ptr = "123"; is compatible and almost equivalent to char ptr[] = { '1', '2', '3', '\0' }; (see http://ideone.com/rFOk3R).
In C a pointer can point to one value or an array of contiguous values. C++ inherited this.
So a string is just an array of characters (char) ended by a '\0'. And a pointer to char can point to an array of char.
The length is given by the number of characters between the beginning and the terminal '\0'. Example of a C strlen giving you the length of the string:
size_t strlen(const char * str)
{
    const char *s;
    for (s = str; *s; ++s) {}
    return(s - str);
}
And yes, it fails horribly if there is no '\0' at the end.

A string literal is an array of N const char where N is the length of the literal including the implicit NUL terminator. It has static storage duration and it's implementation-defined where it is stored. From here on, it's the same as with a normal array - it decays to a pointer to its first character - that's a const char*. What you have there is not legal in C++ (not since the C++11 standard); it should be const char* ch = "123";.
You can get the length of a literal with sizeof operator. Once it decays to a pointer, though, you need to iterate through it and find the terminator (that's what strlen function does).
So, with a const char* ch; you get a pointer to a constant character type that can point to a single character, or to the start of an array of characters (or anywhere between the start and the end). The array can be dynamically, automatically or statically allocated and can be mutable or not.
In something like char ch[] = "text"; you have an array of characters. This is syntactic sugar for a normal array initializer (as in char ch[] = {'t','e','x','t','\0'}; but note that the literal will still be loaded at the start of the program). What happens here is:
an array with automatic storage duration is allocated
its size is deduced from the size of the literal by the compiler
the contents of the literal are copied to the array
As a result, you have a region of storage that you can use at will (unlike literals, which must not be written into).
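The relationship between sizeof on an array and strlen on a pointer can be sketched as follows. The names length_via_pointer and sizes_agree are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Once only a pointer is available, the length has to be found by scanning.
std::size_t length_via_pointer(const char *s) {
    return std::strlen(s);          // walks until the '\0'
}

bool sizes_agree() {
    char ch[] = "text";             // size deduced by the compiler
    static_assert(sizeof ch == 5, "four letters plus the terminator");
    static_assert(sizeof "text" == 5, "the literal has the same size");

    ch[0] = 'n';                    // the array is a writable copy
    assert(length_via_pointer("text") == 4);
    return length_via_pointer(ch) == sizeof ch - 1
        && std::strcmp(ch, "next") == 0;
}
```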

There are no strings in C, but there are pointers to characters.
pt is indeed not pointing to a string, but to a single character (the '1').
However, some functions that take a char* argument assume that the character it points to is followed in memory by more characters, and that a byte set to 0 marks the point where they should stop.
In your example, if you passed the first pt (the address of x) to a function which expects a "null terminated string" (basically, one which expects to encounter a byte with a value of 0 when it should stop processing data), it would read past x and might run into a segmentation fault. x='1' gives x the ASCII value of the character 1, but nothing more, whereas char* pt="123" gives pt the address of the 1 and also puts into that memory the bytes containing the ASCII values of 1, 2 and 3, followed by a byte with a value of 0 (zero).
So the memory (in a 8 bit machine) may look like this:
Address = Content (0x31 is the Ascii code for the character 1 (one))
0xa0 = 0x31
0xa1 = 0x32
0xa2 = 0x33
0xa3 = 0x00
Let's suppose that on the same machine you wrote char* otherString = malloc(4), and that malloc returned 0xb0, which is now the value of otherString. If we wanted to copy our "pt" (which has the value 0xa0) into otherString, the strcpy call would look like so:
strcpy( otherString, pt );
The same as
strcpy( 0xb0, 0xa0 );
strcpy would then copy the byte at address 0xa0 into 0xb0 and check whether it is zero. As it is not, it would increment its pointers to 0xa1 and 0xb1 and copy the byte at 0xa1 into 0xb1, and so on, until its "pt" pointer is 0xa3. There it copies the zero byte and returns, as it has detected that the end of the "string" has been reached.
This is, of course, not 100% how it goes on, and it could be implemented in many different ways.
Here is one http://fossies.org/dox/glibc-2.18/strcpy_8c_source.html
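For comparison, a deliberately simplified byte-by-byte version of the walk-through above might look like this. The name copy_string is hypothetical and this is not the glibc implementation linked above.

```cpp
#include <cassert>
#include <cstring>

// Simplified copy: moves one byte at a time and stops after copying the 0 byte.
char *copy_string(char *dest, const char *src) {
    char *out = dest;
    while ((*out = *src) != '\0') {   // copy, then test the byte just copied
        ++out;
        ++src;
    }
    return dest;
}

bool demo_copy_string() {
    const char *pt = "123";
    char otherString[4];              // room for '1', '2', '3' and the 0 byte
    copy_string(otherString, pt);
    assert(otherString[3] == '\0');
    return std::strcmp(otherString, "123") == 0;
}
```

As in the original walk-through, the caller is responsible for making the destination large enough.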

A pointer to an array?
A pointer points to only one memory address. The phrase that a pointer points to an array is only used in a loose sense---a pointer cannot really store multiple addresses at the same time.
In your example, char *ch="123", the pointer ch is really pointing to the first byte only. You can write code like the following, and it will make perfect sense:
char *ch = new char [1024];
sprintf (ch, "Hello");
delete [] ch;
char x = '1';
ch = &x;
Please note the use of the pointer ch to point to both the memory allocated by new char [1024] line as well as the address of the variable x, while still being the same pointer type.
C-style strings are null terminated
Strings in C are null terminated, i.e., a special '\0' is added to the end of the string and assumed to be there by all char * based functions (such as strlen and printf). This way, you can determine the length of the string by starting at the first byte and continuing till you find the byte containing 0x00.
A verbose, sample implementation of a strlen-style function would be
int my_strlen (const char *startAddress)
{
    int count = 0;
    const char *ptr = startAddress; // const: we only read through ptr
    while (*ptr != 0)
    {
        ++count;
        ++ptr;
    }
    return count;
}

char* pt = "123"; does two things:
1. creates the string literal "123" in read-only memory (usually the .rodata or .text section)
2. creates a char* which is assigned the address of the memory location where the string begins.
Because of this, operations like pt[1] = '2'; are illegal, as you would be attempting to write to read-only memory.
But you can assign the pointer to some other memory location without any problems.

confusion about char pointer in c++

I'm new in c++ language and I am trying to understand the pointers concept.
I have a basic question regarding the char pointer,
What I know is that a pointer is a variable that stores an address value,
so when I write something like this:
char * ptr = "hello";
From my basic knowledge, I think that after = there should be an address to be assigned to the pointer, but here we assign "hello", which is a set of chars.
So what does that mean?
Does the pointer ptr point to an address that stores "hello", or does it store the hello itself?
I'm so confused, hope you guys can help me..
Thanks in advance.

ptr holds the address to where the literal "hello" is stored at. In this case, it points to a string literal. It's an immutable array of characters located in static (most commonly read-only) memory.
You can make ptr point to something else by re-assigning it, but modifying the contents it points to now is illegal. (The literal's type is actually const char[6], which converts to const char*; the conversion to char* existed for C compatibility, was deprecated, and is illegal since C++11.)
Because of this guarantee, the compiler is free to optimize for space, so
char * ptr = "hello";
char * ptr1 = "hello";
might yield two equal pointers. (i.e. ptr == ptr1)
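Because the addresses of identical literals may or may not coincide, code that wants to know whether two strings hold the same text should compare contents. A small sketch, with same_text and demo_same_text as illustrative names:

```cpp
#include <cassert>
#include <cstring>

// Compare by content, not by address.
bool same_text(const char *a, const char *b) {
    return std::strcmp(a, b) == 0;
}

bool demo_same_text() {
    const char *ptr = "hello";
    const char *ptr1 = "hello";
    bool same_address = (ptr == ptr1);   // may be true or false: not guaranteed
    (void)same_address;                  // intentionally not relied upon
    assert(same_text(ptr, ptr1));        // the contents always match
    return same_text(ptr, ptr1);
}
```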

The pointer is pointing to the address where "hello" is stored. More precisely, it is pointing to the 'h' in the "hello".

"hello" is a string literal: a static array of characters. Like all arrays, it can be converted to a pointer to its first element, if it's used in a context that requires a pointer.
However, the array is constant, so assigning it to char* (rather than const char*) is a very bad idea. You'll get undefined behaviour (typically an access violation) if you try to use that pointer to modify the string.

The compiler will "find somewhere" that it can put the string "hello", and the ptr will have the address of that "somewhere".

When you create a new char* by assigning it a string literal, what happens is that the char* gets assigned the address of the literal. So the actual value of the char* might be 0x87F2F1A6 (some hex address value). The char* points to the start (in this case the first char) of the string. In C and C++, all C-style strings are terminated with a \0; this is how the system knows it has reached the end of the string.

char* text = "Hello!" can be thought of as the following:
At program start, you create an array of chars, 7 in length:
{'H','e','l','l','o','!','\0'}. The last one is the null character and shows that there aren't any more characters after it. [It's more efficient than keeping a count associated with the string... A count would take up perhaps 4 bytes for a 32-bit integer, while the null character is just a single byte, or two bytes if you're using Unicode strings. Plus it's less confusing to have a single array ending in the null character than to have to manage an array of characters and a counting variable at the same time.]
The difference between creating an array and making a string constant is that an array is editable and a string constant (or 'string literal') is not. Trying to set a value in a string literal causes problems: they are read-only.
Then, whenever you call the statement char* text = "Hello!", you take the address of that initial array and stick it into the variable text. Note that if you have something like this...
char* text1 = "Hello!";
char* text2 = "Hello!";
char* text3 = "Hello!";
...then it's quite possible that you're creating three separate arrays of {'H','e','l','l','o','!','\0'}, so it would be more efficient to do this...
char* _text = "Hello!";
char* text1 = _text;
char* text2 = _text;
char* text3 = _text;
Most compilers are smart enough to only initialize one string constant automatically, but some will only do that if you manually turn on certain optimization features.
Another note: do not use delete [] on a pointer to a string literal. The literal was not allocated with new [], so deleting it is undefined behaviour even if it appears to do nothing, and it is unnecessary because the literal lives for the whole run of the program.

What is a char*?

Why do we need the *?
char* test = "testing";
From what I understood, we only apply * onto addresses.

This is a char:
char c = 't';
It can only hold one character!
This is a C-string:
char s[] = "test";
It can hold multiple characters. Another way to write the above is:
char s[] = {'t', 'e', 's', 't', 0};
The 0 at the end is called the NUL terminator. It denotes the end of a C-string.
A char* stores the starting memory location of a C-string.1 For example, we can use it to refer to the same array s that we defined above. We do this by setting our char* to the memory location of the first element of s:
char* p = &(s[0]);
The & operator gives us the memory location of s[0].
Here is a shorter way to write the above:
char* p = s;
Notice:
*(p + 0) == 't'
*(p + 1) == 'e'
*(p + 2) == 's'
*(p + 3) == 't'
*(p + 4) == 0 // NUL
Or, alternatively:
p[0] == 't'
p[1] == 'e'
p[2] == 's'
p[3] == 't'
p[4] == 0 // NUL
Another common usage of char* is to refer to the memory location of a string literal:
const char* myStringLiteral = "test";
Warning: This string literal should not be changed at runtime. We use const to warn the programmer (and compiler) not to modify myStringLiteral in the following illegal manner:
myStringLiteral[0] = 'b'; // Illegal! Do not do this for const char*!
This is different from the array s above, which we are allowed to modify. This is because the string literal "test" is automatically copied into the array at initialization phase. But with myStringLiteral, no such copying occurs. (Where would we copy to, anyways? There's no array to hold our data... just a lonely char*!)
1 Technical note: char* merely stores a memory location to things of type char. It can certainly refer to just a single char. However, it is much more common to use char* to refer to C-strings, which are NUL-terminated character sequences, as shown above.
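The equivalence of p[i] and *(p + i), and the fact that writes through p change the array s, can be verified with a short sketch. The function name matching_positions is invented for this example.

```cpp
#include <cassert>
#include <cstddef>

// Counts the positions where p[i] and *(p + i) agree, terminator included.
std::size_t matching_positions() {
    char s[] = "test";
    char *p = s;                        // same as &(s[0])
    std::size_t matches = 0;
    for (std::size_t i = 0; i < sizeof s; ++i) {
        if (p[i] == *(p + i) && &p[i] == &s[i]) {
            ++matches;                  // same value and same address
        }
    }
    p[0] = 'b';                         // writes through p land in s
    assert(s[0] == 'b');
    return matches;
}
```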

The char type can only represent a single character. When you have a sequence of characters, they are piled next to each other in memory, and the location of the first character in that sequence is returned (assigned to test). test is nothing more than a pointer to the memory location of the first character in "testing", saying that the type it points to is a char.

You can do one of two things:
char *test = "testing";
or:
char test[] = "testing";
Or, a few variations on those themes like:
char const *test = "testing";
I mention this primarily because it's the one you usually really want.
The bottom line, however, is that char x; will only define a single character. If you want a string of characters, you have to define an array of char or a pointer to char (which you'll initialize with a string literal, as above, more often than not).
There are real differences between the first two options though. char *test=... defines a pointer named test, which is initialized to point to a string literal. The string literal itself is allocated statically (typically right along with the code for your program), and you're not supposed to (attempt to) modify it -- thus the preference for char const *.
The char test[] = .. allocates an array. If it's a global, it's pretty similar to the previous except that it does not allocate a separate space for the pointer to the string literal -- rather, test becomes the name attached to the string literal itself.
If you do this as a local variable, test will still refer directly to the string literal - but since it's a local variable, it allocates "auto" storage (typically on the stack), which gets initialized (usually from a normal, statically allocated string literal) on every entry to the block/scope where it's defined.
The latter versions (with an array of char) can act deceptively similar to a pointer, because the name of an array will decay to the address of the beginning of the array anytime you pass it to a function. There are differences though. You can modify the array, but modifying a string literal gives undefined behavior. Conversely, you can change the pointer to point at some other chars, so something like:
char *test = "testing";
if (whatever)
test = "not testing any more";
...is perfectly fine, but trying to do the same with an array won't work (arrays aren't assignable).
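To change what an array holds, its contents have to be overwritten, since the array itself cannot be reassigned. A minimal sketch, assuming a buffer size of 32 chosen only for this example:

```cpp
#include <cassert>
#include <cstring>

bool demo_reseat_and_copy() {
    const char *test = "testing";
    test = "not testing any more";      // reseating a pointer is fine

    char buf[32] = "testing";           // hypothetical buffer size
    // buf = "not testing any more";    // would not compile: arrays are not assignable
    const char *replacement = "not testing any more";
    assert(std::strlen(replacement) < sizeof buf);   // make sure it fits
    std::strcpy(buf, replacement);      // overwrite the contents instead

    return std::strcmp(test, buf) == 0;
}
```

In C++ code that is free to use the standard library, std::string avoids the manual size check.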

The main thing people forgot to mention is that "testing" is an array of chars in memory, there's no such thing as primitive string type in c++. Therefore as with any other array, you can't reference it as if it is an element.

char* represents the address of the beginning of the contiguous block of memory of char's. You need it as you are not using a single char variable you are addressing a whole array of char's
When accessing this, functions will take the address of the first char and step through the memory. This is possible as arrays use contiguous memory (i.e. all of the memory is consecutive in memory).
Hope this clears things up! :)

Using a * says that this variable points to a location in memory. In this case, it is pointing to the location of the string "testing". With a char pointer, you are not limited to a single character, because the pointer can refer to the first of a whole sequence of characters.

In C, an array is usually passed around as a pointer to its first element.
