I was wondering what properties double quotes have, especially in relation to initializing char pointers.
char *ptr="hello";
char array[6]={'h','e','l','l','o'};
cout<<ptr<<endl<<array<<endl;
The above code prints hello twice. I know that using double quotes denotes a string, or a char array with a null character at the end. Since ptr is looking for a memory address (char*) to be assigned to, I'm guessing that the "hello" resolves to the memory address of the 'h', despite the fact that you are also filling in char values for the rest of the array? If this is the case, does that mean that in the above code
char *ptr="hello";
the double quotes creates a string somewhere in memory and then ptr is assigned to the first element of that string, whereas
char array[6]={'h','e','l','l','o'};
creates an array somewhere in memory and then assigns values for each index based on the right hand side of the assignment operator?
There are two things to note here of importance.
char array[6]={'h','e','l','l','o'};
This will allocate 6 bytes on the stack and initialize them to "hello\0"; The following are equlivalent:
char array[] = "hello";
char array[6] = "hello";
However, the below is different.
char *ptr="hello";
This will allocate a pointer on the stack, that points to the constant string "hello\0". This is an important distinction, if you alter the value ptr points to, you will cause undefined behavior as you will be altering the constant value it points to.
Example:
#include <stdio.h>
void testLocal()
{
char s[] = "local"; // locally allocated 6 bytes initialized to "local\0"
s[0] = 'L';
printf("%s\n", s);
}
void testPointer()
{
char *s = "pointer"; // locally allocated pointer, pointing to a constant
// This segfaults on my system, but really the behavior is undefined.
s[0] = 'P';
printf("%s\n", s);
}
int main(int argc, char * argv[])
{
testLocal();
testPointer();
}
For strings, there is a special terminal character \0 which is added to the end. This tells it that that's the end of the string.
So, if you have a string "hello", it'll keep reading each character: "h", "e", "l", "l", "o", "\0", which tells it to stop.
The character array is similar, but doesn't have that terminal character. Instead, the length of the array itself is what indicates how many characters to read (which won't necessarily work for all methods).
Related
I attempted to initialize char array with NULL like this syntax.
char str[5] = NULL;
But it returned error..
How can I initialize char array with NULL or 0 in C++?
Concretely, I want to print "Hello" in this example code.
#include <iostream>
int main()
{
char str[5] = NULL;
if(!str)
std::cout << "Hello" << std::endl;
return 0;
}
This code will return error because of incorrect initialization. Then, what initializing sentence should I replace sentence with?
An array can not be null. Null is state of a pointer, and an array is not a pointer.
In the expression !str, the array will decay to the address of the first element. The type of this decayed address is a pointer, but you cannot modify the decayed pointer. Since the array is never stored in the address 0 (except maybe in some special case on an embedded system where an array might actually be stored at that memory location), !str will never be true.
What you can do, is initialize all of the sub-objects of the array to zero. This can be achieved using the value-initialization syntax:
char str[5]{};
As I explained earlier, !str is still not meaningful, and is always false. One sensible thing that you might do is check whether the array contains an empty string. An empty string contains nothing except the terminator character (the array can have elements after the terminator, but those are not part of the string).
Simplest way to check whether a string is empty is to check whether the first character is the terminator (value-initialized character will be a terminator, so a value initialized character array does indeed contain the empty string):
if (!str[0]) // true if the string is empty
//...
char str[5] = {'\0'};
if (str[0] != '\0')
//...
If you now put some characters into str it will print str up to the last '\0'.
Every string literal ends with '\0', you must make sure your array ends with '\0' too, if not, data will be read beyond your array (until '\0' is encountered) and possibly beyond your application's memory space in which case your app will crash.
You should use std::string or QString however if you need a string.
You can't initialise a char array with NULL, arrays can never be NULL. You seem to be mixing up pointers and arrays. A pointer could be initialised with NULL.
You can initialise the chars of your array with 0, that would be
char str[5] = {0};
but I don't think that's what you're asking.
#include <stdio.h>
using namespace std;
int main() {
char str[5];
for(int i=0;i<5;i++){
str[i]=NULL;
}
printf("success");
return 0;
}
Hope this helps.
I'm studying on pointers and I'm stuck when I see char *p[10]. Because something is misunderstood. Can someone explain step-by-step and blow-by-blow why my logic is wrong and what the mistakes are and where did I think wrong and how should I think. Because I want to learn exactly. Also what about int *p[10]; ? Besides, for example x is a pointer to char but just char not chars. But how come char *x = "possible";
I think above one should be right but, I have seen for char *name[] = { "no month","jan","feb" }; I am really confused.
Your char *p[10] diagram shows an array where each element points to a character.
You could construct it like this:
char f = 'f';
char i = 'i';
char l1 = 'l';
char l2 = 'l';
char a1 = 'a';
char r1 = 'r';
char r2 = 'r';
char a2 = 'a';
char y = 'y';
char nul = '\0';
char *p[10] = { &f, &i, &l1, &l2, &a1, &r1, &r2, &a2, &y, &nul };
This is very different from the array
char p[10] = {'f', 'i', 'l', 'l', 'a', 'r', 'r', 'a', 'y', '\0'};
or
char p[10] = "fillarray";
which are arrays of characters, not pointers.
A pointer can equally well point to the first element of an array, as you've probably seen in constructions like
const char *p = "fillarray";
where p holds the address of the first element of an array defined by the literal.
This works because an array can decay into a pointer to its first element.
The same thing happens if you make an array of pointers:
/* Each element is a pointer to the first element of the corresponding string in the initialiser. */
const char *name[] = { "no month","jan","feb" };
You would get the same results with
const char* name[3];
name[0] = "no month";
name[1] = "jan";
name[2] = "feb";
char c = 'a';
Here, c is a char, typically a single byte of ASCII encoded data.
char* ptr = &c;
ptr is a char pointer. In C, all it does is point to a memory location and doesn't make any guarantees about what is at that location. You could use a char* to pass a char to a function to allow the function to allow the function to make changes to that char (pass by reference).
A common C convention is for a char* to point to a memory location where several characters are stored in sequence followed by the null character \0. This convention is called a C string:
char const* cstr = "hello";
cstr points to a block of memory 6 bytes long, ending with a null character. The data itself cannot be modified, though the pointer can be changed to point to something else.
An array of chars looks similar, but behaves slightly differently.
char arr[] = "hello";
Here arr IS a memory block of 6 chars. Since arr represents the memory itself, it cannot be changed to point to another location. The data can be modified though.
Now,
char const* name[] = { "Jan", " Feb"..., "Dec"};
is an array of pointer to characters.
name is a block of memory, each containing a pointer to a null-terminated string.
In the diagram, I think string* was accidentally used instead of char*. The difference between the left and the right, is not a technical difference really, but a difference in the way a char* is used. On the left each char* points to a single character, whereas in the one on the right, each char* points to a null-terminated block of characters.
Both are right.
A pointer in C or C++ may point either to a single item (a single char) or to the first in an array of items (char[]).
So a char *p[10]; definition may point to 10 single characters or 10 arrays (i.e. 10 strings).
Let’s go back to basics.
First, char *p is simply a pointer. p contains nothing more than a memory address. That memory address can point to anything, anywhere. By convention, we have always used NULL (or, I hate this method, assigning it to zero – yeah, they are the same “thing”, but NULL has traditionally been used in conjunction with pointers, so when you’re eyes flit across the code, you see NULL – you think “pointer”).
Anyway, that memory address being pointed to can contain anything. So, to use within the language, we type it, in this case it is a pointer to a character (char *p). This can be overridden by type casting, but that’s for a later time.
Second, we know anytime we see p[10], that we are dealing with an array. Again, the array can be an array of characters, an array of ints, etc. – but it’s still an array.
Your example: char *p[10], is then nothing more than an array of 10 character pointers. Nothing more, nothing less. Your problem comes in because you are trying to force the “string” concept onto this. There ain’t no strings in C. There ain’t no objects in C. The concept of a NULL-terminated string can most certainly be used. But a “string” in C is nothing more than an array of characters, terminated by a NULL (or, if you use some of the appropriate functions, you can use a specific number of characters – strncpy instead of strcpy, etc.). But, for all its appearance, and apparent use, there are no strings in C. They are nothing more than arrays of characters, with a few supporting functions that happen to stop going through the array when a NULL is encountered.
So – char a[10] – is simply an array of characters that is 10 characters long. You can fill it with any characters you wish. If one of those is the NULL character, then that terminates what is typically called a “C-style string”. There are functions that support this type of character array (i.e. “string”), but it is still a use of a character array.
Your confusion comes in because you are trying to mix C++ string objects, and forcing that concept onto C arrays of characters. As ugoren noted – your examples are both correct – because you are dealing with arrays of character pointers, NOT strings. Again, putting a NULL somewhere in that character array is happily supported by several C functions that give you the ability to work with a “string-like” concept – but they are not truly strings. Unless of course, you want to phrase it that a string is nothing more than one character following another – an array.
How should I understand char * ch="123"?
'1' is a char, so I can use:
char x = '1';
char *pt = &x;
But how do I understand char *pt="123"? Why can the char *pt point to string?
Is pt's value the first address value for "123"? If so, how do I get the length for the string pointed to by pt?
That is actually a really good question, and it is the consequence of several oddities in the C language:
1: A pointer to a char (char*) can of course also point to a specific char in an array of chars. That is what pointer arithmetic relies on:
// create an array of three chars
char arr[3] = { 'a', 'b', 'c'};
// point to the first char in the array
char* ptr = &arr[0]
// point to the third char in the array
char* ptr = &arr[2]
2: A string literal ("foo") is actually not a string as such, but simply an array of chars, followed by a null byte. (So "foo" is actually equivalent to the array {'f', 'o', 'o', '\0'})
3: In C, arrays "decay" into pointers to the first element. (This is why many people incorrectly says that "there is no difference between arrays and pointers in C"). That is, when you try to assign an array to a pointer object, it sets the pointer to point to the first element of the array. So given the array arr declared above, you can do char* ptr = arr, and it means the same as char* ptr = &arr[0].
4: In every other case, syntax like this would make the pointer point to an rvalue (loosely speaking, a temporary object, which you can't take the address of), which is generally illegal. (You can't do int* ptr = &42). But when you define a string literal (such as "foo"), it does not create an rvalue. Instead, it creates the char array with static storage. You're creating a static object, which is created when the program is loaded, and of course a pointer can safely point to that.
5: String literals are actually required to be marked as const (because they are static and read-only), but because early versions of C did not have the const keyword, you are allowed to omit the const specifier (at least prior to C++11), to avoid breaking old code (but you still have to treat the variable as read-only).
So char* ch = "123" really means:
write the char array {'1', '2', '3', '\0'} into the static section of the executable (so that when the program is loaded into memory, this variable is created in a read-only section of memory)
when this line of code is executed, create a pointer which points to the first element of this array
As a bonus fun fact, this differs from char ch[] = "123";, which instead means
write the char array {'1', '2', '3', '\0'} into the static section of the executable (so that when the program is loaded into memory, this variable is created in a read-only section of memory)
when this line of code is executed, create an array on the stack which contains a copy of this statically allocated array.
char* ptr = "123"; is compatible and almost equivalent to char ptr[] = { '1', '2', '3', '\0' }; (see http://ideone.com/rFOk3R).
In C a pointer can point to one value or an array of contiguous values. C++ inherited this.
So a string is just an array of character (char) ended by a '\0'. And a pointer to char can point to an array of char.
The length is given by the number of character between the begining and the terminal '\0'. Exemple of C strlen giving you the length of the string:
size_t strlen(const char * str)
{
const char *s;
for (s = str; *s; ++s) {}
return(s - str);
}
An yes it fails horribly if there is no '\0' at the end.
A string literal is an array of N const char where N is the length of the literal including the implicit NUL terminator. It has static storage duration and it's implementation defined where it is stored. From here on, it's the same a with a normal array - it decays to a pointer to its first character - that's a const char*. What you have there is not legal (not anymore since onset of C++11 standard) in C++, it should be const char* ch = "123";.
You can get the length of a literal with sizeof operator. Once it decays to a pointer, though, you need to iterate through it and find the terminator (that's what strlen function does).
So, with a const char* ch; you get a pointer to a constant character type that can point to a single character, or to the start of an array of characters (or anywhere between the start and the end). The array can be dynamically, autimatically or statically allocated and can be mutable or not.
In something like char ch[] = "text"; you have an array of characters. This is syntatic sugar for a normal array initializer (as in char ch[] = {'t','e','x','t','\0'}; but note that the literal will still be loaded at the start of the program). What hapens here is:
an array with automatic storage duration is allocated
its size is deduced from the size of the literal by the compiler
the contents of the literal are copied to the array
As a result, you have a region of storage that you can use at will (unlike literals, which must not be written into).
There are no strings in C, but there are pointers to characters.
*pt is indeed not pointing to a string, but to a single characters (the '1').
However, some functions take char* as argument assume that the byte on the address following the address that their argument points to, is set to 0 if they are not to operate on it.
In your example, if you tried using pt on a function which expects a "null terminated string" (basically, which expects that it will encounter a byte with a value of 0 when it should stop processing data) you will run into a segmentation fault, as x='1' gives x the ascii value of the 1 character, but nothing more, whereas char* pt="123" gives pt the value of the address of 1, but also puts into that memory, the bytes containing ascii values of 1, 2,3 followed by a byte with a value of 0 (zero).
So the memory (in a 8 bit machine) may look like this:
Address = Content (0x31 is the Ascii code for the character 1 (one))
0xa0 = 0x31
0xa1 = 0x32
0xa2 = 0x33
0xa3 = 0x00
Let's suppose that you in the same machine char* otherString = malloc(4),suppose that malloc returns a value of 0xb0, which is now the value of otherString, and we wanted to copy our "pt" (which would have a value of 0xa0) into otherString, the strcpy call would look like so:
strcpy( otherString, pt );
The same as
strcpy( 0xb0, 0x0a );
strcpy would then take the value of address 0xa0 and copy it into 0xb0, it would increment it's pointers to "pt" to 0xa1, check if 0xa1 is zero, if it is not zero, it would increment it's pointer to "otherString" and copy 0xa1 into 0xb1, and so on, until it's "pt" pointer is 0xa3, in this case, it will return as it detected that the end of the "string" has been reached.
This is of cause, not 100% how it goes on, and it could be implemented in many different ways.
Here is one http://fossies.org/dox/glibc-2.18/strcpy_8c_source.html
A pointer to an array?
A pointer points to only one memory address. The phrase that a pointer points to an array is only used in a loose sense---a pointer cannot really store multiple addresses at the same time.
In your example, char *ch="123", the pointer ch is really pointing to the first byte only. You can write code like the following, and it will make perfect sense:
char *ch = new char [1024];
sprintf (ch, "Hello");
delete [] ch;
char x = '1';
ch = &x;
Please note the use of the pointer ch to point to both the memory allocated by new char [1024] line as well as the address of the variable x, while still being the same pointer type.
C-style strings are null terminated
Strings in C used to be null terminated, i.e., a special '\0' was added to the end of the string and assumed to be there for all char * based functions (such as strlen and printf) This way, you can determine the length of the string by starting at the first byte and continue till you find the byte containing 0x00.
A verbose, sample implementation of anstrlen style function would be
int my_strlen (const char *startAddress)
{
int count = 0;
char *ptr = startAddress;
while (*ptr != 0)
{
++count;
++ptr;
}
return count;
}
char* pt = "123"; does two things:
1. creates the string literal "123" in ROM (this is usually in .text section)
2. creates a char* which is assigned the beginning of memory location where the string is located.
because of this operations like pt[1] = '2'; are illegal as you would be attempting to write to ROM memory.
But you can assign the pointer to some other memory location without any problems.
Code:
#include <stdio.h>
int main() {
char *str;
char i = 'a';
str = &i;
str = "Hello";
printf("%s, %c, %x, %x", str, i, str, &i);
return 0;
}
I get this output:
Hello, a, 403064, 28ff0b
I have following two doubts:
How can I store a string without allocating any memory for it. str is a character pointer and is pointing to where char variable i. When I add str = "Hello"; aren't I using 5 bytes from that location 4 of which are not allocated?
Since, I code str = &i; shouldn't str and &i have same value when I printf them? When I remove the str = "Hello"; statement str and &i are same. And if str and &i are same then I believe when I say str = "Hello" it should overwrite 'a' with 'H' and the rest 'ello\0' come into the subsequent bytes.
I believe the whole problem is with str = "Hello" statement. It doesn't seem to be working like what I think.
Please someone explain how it works??
When the compiler encounters a string literal, in this case "Hello", memory is allocated in the static (global) memory area. This "allocation" is done before your program executes.
When your program starts executing at main, a stack frame is allocated to store the local variables of main: str and i. Note that str is a simple variable that just stores an address. It does not store any characters. It just stores a pointer.
The statement str = &i; writes into variable str the address of i.
The statement str = "Hello" writes into the variable str, the address of the string literal "Hello" which has been pre-allocated by the compiler. This is a completely different address than that of i. That assignment does not move any of the characters in the word "Hello" anywhere at all.
TL;DR the value of a "string" variable in C is just a pointer. Assigning to a string variable is assigning a number, namely an address.
The compiler writes the sequence of bytes { 'H', 'E', 'L', 'L', 'O', '\0' } in a section of the executable called the data segment.
When the application runs, it takes the address of these bytes and stores them in the variable representing 'str'.
It doesn't have to "allocate" memory in the sense that, at run time, the program does not have to ask the OS for memory to store the text.
You should try to avoid treating string literals like this as non-const. GCC's "-Wall" option promotes assignment of string literals to "char*" pointers as
warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
Many compilers, when compiling with optimization, will do something called "string pooling", which avoids duplicating strings.
const char* str1 = "hello";
const char* str2 = "hello";
If compiled with string pooling, the executable might only contain one instance of "hello".
When I say str = "Hello" aren't I using 5 bytes from that location 4
of which are not allocated?
No. The compiler sets aside 6 bytes (remember the null terminator) in a different part of memory (a read-only part, but that's an answer to a different question). The assignment:
str = "Hello";
causes str to point to the location of the first of these 6 bytes, the H.
Since, I said str=&i; shoudln't str and &i have same value when I
printf them?
Yes, but you set str to point to something else on the very next line, before you printed anything.
The literal string "Hello" is taking 6 bytes of memory somewhere, probably in the program space. Assigning the pointer to it just sets the pointer to where the string already exists. It doesn't copy the characters at all.
If you wanted to copy the characters you would need to use strcpy, but as you did not set the pointer to a writable buffer of sufficient size it would cause undefined behavior if you did.
str is a pointer to a char. When you say str = &i; you are pointing it to an int. When you say str = "Hello"; you are changing str to point to a location where the character sequence 'H', 'e', 'l', 'l', 'o', 0 is stored.
But you say str = something else right after you say str = &i
Looks like you haven't quite grasped pointers yet...
Okay, simply put, every string is a pointer in C.
If we run with this:
int main() {
char *str;
char i='a';
str = &i;
str = "Hello";
printf("%s, %c, %x, %x", str, i, str, &i);
return 0;
}
When you set str = &i, you are making str point to i. Thus, i == *str and &i == str hold true.
When you call str = "Hello";, str now points to a statically allocated array of size 6 (this usually resides directly in your program code). Because str is a pointer, when you reset it to point to the new array, it makes no change to i. Now, if instead of setting str to "Hello", we did *str = 'Z';, i would now have the value of 'Z', while str still points to i.
First how can I store the string without allocating any memory for it. str is a chracter pointer and is pointing to where char i is stored. When I say str = "Hello" aren't I using 5 bytes from that location 4 of which are not allocated?
The memory for the string is allocated by the compiler. I don't think the standard specifies exactly how the compiler must do this. If you run 'strings' against your executable, you should find "Hello" in there somewhere.
Since, I said str=&i; shoudln't str and &i have same value when I printf them? When I remove the str = "Hello" statement str and &i are same. And if str and &i are same then I believe when I say str="Hello" it should overwrite 'a' with 'H' and the rest 'ello\0' come into the subsequent bytes.
I think what you are missing here is that str = "Hello" does not copy the string to the location pointed to by str. It changes what str points to. "Hello" is in memory and you are assigning that memory location to the pointer. If you want to copy memory to a pointer, you need to use memcpy() or something similar.
Why do we need the *?
char* test = "testing";
From what I understood, we only apply * onto addresses.
This is a char:
char c = 't';
It can only hold one character!
This is a C-string:
char s[] = "test";
It can hold multiple characters. Another way to write the above is:
char s[] = {'t', 'e', 's', 't', 0};
The 0 at the end is called the NUL terminator. It denotes the end of a C-string.
A char* stores the starting memory location of a C-string.1 For example, we can use it to refer to the same array s that we defined above. We do this by setting our char* to the memory location of the first element of s:
char* p = &(s[0]);
The & operator gives us the memory location of s[0].
Here is a shorter way to write the above:
char* p = s;
Notice:
*(p + 0) == 't'
*(p + 1) == 'e'
*(p + 2) == 's'
*(p + 3) == 't'
*(p + 4) == 0 // NUL
Or, alternatively:
p[0] == 't'
p[1] == 'e'
p[2] == 's'
p[3] == 't'
p[4] == 0 // NUL
Another common usage of char* is to refer to the memory location of a string literal:
const char* myStringLiteral = "test";
Warning: This string literal should not be changed at runtime. We use const to warn the programmer (and compiler) not to modify myStringLiteral in the following illegal manner:
myStringLiteral[0] = 'b'; // Illegal! Do not do this for const char*!
This is different from the array s above, which we are allowed to modify. This is because the string literal "test" is automatically copied into the array at initialization phase. But with myStringLiteral, no such copying occurs. (Where would we copy to, anyways? There's no array to hold our data... just a lonely char*!)
1 Technical note: char* merely stores a memory location to things of type char. It can certainly refer to just a single char. However, it is much more common to use char* to refer to C-strings, which are NUL-terminated character sequences, as shown above.
The char type can only represent a single character. When you have a sequence of characters, they are piled next to each other in memory, and the location of the first character in that sequence is returned (assigned to test). Test is nothing more than a pointer to the memory location of the first character in "testing", saying that the type it points to is a char.
You can do one of two things:
char *test = "testing";
or:
char test[] = "testing";
Or, a few variations on those themes like:
char const *test = "testing";
I mention this primarily because it's the one you usually really want.
The bottom line, however, is that char x; will only define a single character. If you want a string of characters, you have to define an array of char or a pointer to char (which you'll initialize with a string literal, as above, more often than not).
There are real differences between the first two options though. char *test=... defines a pointer named test, which is initialized to point to a string literal. The string literal itself is allocated statically (typically right along with the code for your program), and you're not supposed to (attempt to) modify it -- thus the preference for char const *.
The char test[] = .. allocates an array. If it's a global, it's pretty similar to the previous except that it does not allocate a separate space for the pointer to the string literal -- rather, test becomes the name attached to the string literal itself.
If you do this as a local variable, test will still refer directly to the string literal - but since it's a local variable, it allocates "auto" storage (typically on the stack), which gets initialized (usually from a normal, statically allocated string literal) on every entry to the block/scope where it's defined.
The latter versions (with an array of char) can act deceptively similar to a pointer, because the name of an array will decay to the address of the beginning of the array anytime you pass it to a function. There are differences though. You can modify the array, but modifying a string literal gives undefined behavior. Conversely, you can change the pointer to point at some other chars, so something like:
char *test = "testing";
if (whatever)
test = "not testing any more";
...is perfectly fine, but trying to do the same with an array won't work (arrays aren't assignable).
The main thing people forgot to mention is that "testing" is an array of chars in memory, there's no such thing as primitive string type in c++. Therefore as with any other array, you can't reference it as if it is an element.
char* represents the address of the beginning of the contiguous block of memory of char's. You need it as you are not using a single char variable you are addressing a whole array of char's
When accessing this, functions will take the address of the first char and step through the memory. This is possible as arrays use contiguous memory (i.e. all of the memory is consecutive in memory).
Hope this clears things up! :)
Using a * says that this variable points to a location in memory. In this case, it is pointing to the location of the string "testing". With a char pointer, you are not limited to just single characters, because now you have more space available to you.
In C a array is represented by a pointer to the first element in it.