C Style Strings Difference : C/C++ [duplicate] - c++

This question already has answers here:
What is the difference between char a[] = "string"; and char *p = "string";?
(8 answers)
Closed 9 years ago.
What is the difference between C Style Strings
char str[10]="Hello";
char str[]="Hello";
char* str= "Hello";
1) I believe that char str[10]="Hello" is an automatic variable and is stored on the stack. True? I.e., it allocates 10 bytes on the stack.
2) Is char str[]="Hello"; also stored on the stack? I.e., does it allocate 6 bytes (including the null character) on the stack?
3) Does char* str= "Hello"; store the pointer str on the stack and the object "Hello" on the heap? I.e., does it allocate 6 bytes (including the null character) on the heap?
4) All strings (in questions 1, 2 and 3) are null-terminated. True/False?
5) Whether it is C or C++, whenever we create a string like "Hello", it is always null-terminated. Suppose in C++ we declare string str = "Hello";. Is it also null-terminated?
EDIT
Consider All declared in main().
#Negative points and close requests. I am asking this question with respect to where they are stored heap or stack? And also null termination.

"Consider All declared in main()."
Then
1) Yes.
2) Yes.
3) Yes, and no (it's stored neither on the stack nor in the heap in common implementations). "i.e. allocates 6 bytes" -- you seem to have forgotten about the memory required for the pointer. Also, there's an erroneous claim in the comments and in another answer that char* str= "Hello"; is wrong, but in fact it is legal C and was legal (though deprecated) C++ until C++11 made the conversion ill-formed ... see What is the type of string literals in C and C++?
4) True, but it would be false if you changed 10 to 5 -- that is, given char str[5]="Hello"; (legal in C, an error in C++), str is not NUL-terminated.
5) False and no (the implementation might store a NUL following the string, and C++11 requires it, but that NUL isn't part of the string).
"I am asking this question with respect to where they are stored heap or stack?"
Where do people get the idea that these are the only sorts of memory? Local variables are stored on the stack and memory allocated via malloc or (non-placement) new is allocated from the heap. Program code, file-scope variables, and literals fall into neither of those categories.

You are looking at this kind of sideways, which is probably why you are confused ;-)
1) If these variables are all declared inside a routine definition, without the static keyword, then they are all on the stack.
BUT char str[10] and char str[] are arrays - you get all characters of the array on the stack.
char *str is a pointer to one or more characters. Only the pointer is sure to be on the stack.
2) "Hello" always represents a NUL-terminated string in C - it's 6 chars long. If you wanted to initialize a character array to contain a set of characters which is not NUL-terminated, a plain string-literal initializer won't do it, except in C when the declared array size leaves no room for the terminator.
3) As people have pointed out in the comments, it's unclear what char *str = "Hello"; does, or even whether it's legal. If it were char const *str = "Hello"; and the compiler accepted it, I'd expect to find the 6 character string somewhere anonymous, global, and possibly protected.
4) I haven't a clue what the "string" class does in C++.

Related

Dynamic memory allocation in C++ language [duplicate]

This question already has answers here:
Char and strcpy in C
(5 answers)
Closed 2 years ago.
I want to allocate space for char array (string) in C++. When I allocate memory for 10 chars, I can also assign more characters to the char array. When I print it, it gives some of the additionally assigned characters from the array.
#include <iostream>
#include <string.h>
using namespace std;
int main()
{
char *name = new char[10];
strcpy(name, "MoreThanTenCharacters"); // bug under discussion: writes 22 bytes into a 10-byte buffer
cout << name << endl;
}
Although the allocated memory is for 10 characters, I can assign more. Printing gives exactly the same value. What is the logic behind it?
When the buffer pointed to by the first argument is shorter than the string pointed to by the second argument (taking the null terminator into consideration), the copy will overflow the buffer into surrounding memory, and the behaviour of the program is undefined.
Printing gives exactly the same value. What is the logic behind it?
You've observed some behaviour. This is one of the possible behaviours that the program could have when the behaviour is undefined.
So, what is the correct way of allocating memory for JUST 10 chars and assigning a string to it?
Your allocation is correct although not ideal. Using a bare pointer is unsafe; in the end you leak the allocation. It's the copying where your bug happens.
An efficient and simple option is to use std::string. If your goal is to store the 10-character prefix of the input, then the following would be correct:
std::string name("MoreThanTenCharacters", 10);

confusion about char pointer in c++

I'm new to the C++ language and I am trying to understand the concept of pointers.
I have a basic question regarding the char pointer.
What I know is that a pointer is a variable that stores an address value,
so when I write something like this:
char * ptr = "hello";
From my basic knowledge, I think that after = there should be an address to be assigned to the pointer, but here we assign "hello", which is a set of chars.
So what does that mean?
Does the pointer ptr point to an address that stores "hello"? Or does it store the "hello" itself?
I'm so confused, hope you guys can help me.
Thanks in advance.
ptr holds the address to where the literal "hello" is stored at. In this case, it points to a string literal. It's an immutable array of characters located in static (most commonly read-only) memory.
You can make ptr point to something else by reassigning it, but you must not modify the contents of the literal through it. (In C++ the literal is actually an array of const char; the conversion to char* exists for C compatibility, was deprecated in C++03, and is ill-formed in C++11.)
Because of this guarantee, the compiler is free to optimize for space, so
char * ptr = "hello";
char * ptr1 = "hello";
might yield two equal pointers. (i.e. ptr == ptr1)
The pointer is pointing to the address where "hello" is stored. More precisely, it is pointing to the 'h' in "hello".
"hello" is a string literal: a static array of characters. Like all arrays, it can be converted to a pointer to its first element, if it's used in a context that requires a pointer.
However, the array is constant, so assigning it to char* (rather than const char*) is a very bad idea. You'll get undefined behaviour (typically an access violation) if you try to use that pointer to modify the string.
The compiler will "find somewhere" that it can put the string "hello", and the ptr will have the address of that "somewhere".
When you create a new char* by assigning it a string literal, the char* gets assigned the address of the literal. So the actual value of the char* might be 0x87F2F1A6 (some hex address value). The char* points to the start (in this case the first char) of the string. In C and C++, all C-style strings are terminated with a '\0'; this is how code that walks the string knows it has reached the end.
char* text = "Hello!" can be thought of as the following:
At program start, you create an array of chars, 7 in length:
{'H','e','l','l','o','!','\0'}. The last one is the null character and shows that there aren't any more characters after it. [It's more efficient than keeping a count associated with the string... A count would take up perhaps 4 bytes for a 32-bit integer, while the null character is just a single byte, or two bytes if you're using Unicode strings. Plus it's less confusing to have a single array ending in the null character than to have to manage an array of characters and a counting variable at the same time.]
The difference between creating an array and making a string constant is that an array is editable and a string constant (or 'string literal') is not. Trying to set a value in a string literal causes problems: they are read-only.
Then, whenever you call the statement char* text = "Hello!", you take the address of that initial array and stick it into the variable text. Note that if you have something like this...
char* text1 = "Hello!";
char* text2 = "Hello!";
char* text3 = "Hello!";
...then it's quite possible that you're creating three separate arrays of {'H','e','l','l','o','!','\0'}, so it would be more efficient to do this...
char* _text = "Hello!";
char* text1 = _text;
char* text2 = _text;
char* text3 = _text;
Most compilers are smart enough to only initialize one string constant automatically, but some will only do that if you manually turn on certain optimization features.
Another note: never use delete [] on a pointer to a string literal. The literal was not allocated with new [], so deleting it is undefined behaviour, even if it appears to do nothing on some implementations.

Copy string from char pointer to char pointer

char * p_one = "this is my first char pointer";
char * p_two= "this is second";
strcpy(p_one ,p_two);
Consider the above code. It gives an access violation error.
So please help me understand:
1) Where is the current "this is my first char pointer" string stored in memory? Heap or stack?
2) Why do I need to allocate memory for p_one before calling strcpy, even though it is already storing the first string? Why can't the "this is second" string be copied to the same location?
3) If I allocate memory for p_one before calling strcpy, then what happens to the "this is my first char pointer" string that p_one pointed to? Is it kept in memory?
4) How does strcpy know whether a specific pointer has allocated memory or not?
1) Implementation-defined (usually read-only) memory.[Ref 1]
2) You do not need to as long as you don't modify the source string literal.
3) If you allocate memory to p_one, then it will point to the newly allocated memory region. The string literal stays where it is: it is guaranteed to be alive throughout the lifetime of the program. String literals have static storage duration.[Ref 2]
4) It doesn't. It is the user's responsibility to ensure that.
Good Read:
[Ref 1]
What is the difference between char a[] = "string"; and char *p = "string";?
[Ref 2]
"life-time" of string literal in C
First off, your compiler should be warning that p_one and p_two are effectively const char *, because the compiler allocates the storage for these strings at compile time.
One reason you cannot modify them is that you could write past the end of the literal and overwrite whatever follows it in memory; this kind of buffer overflow is what many exploits rely on.
Also, the compiler could be smart and realize that you use this string in 10 places and that it is the same each time, so it stores a single copy. Modifying it from one place then changes it everywhere, which destroys the logic of the other 9 places that use it.
Answering all the questions in order
Put simply, your char pointer itself is always stored on the stack here. Remember that even when you use memory allocation, you are responsible for determining the length of the string and leaving room for the '\0' character.
This would be one solution, according to code you have mentioned:
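#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char * p_one = "this is my first char pointer"; /* points at a read-only literal */
char * p_two= "this is second";
size_t keylen=strlen(p_two);
p_one=(char *)malloc(keylen+1); /* +1 for the terminating '\0' */
if (p_one == NULL) return 1; /* allocation failed */
strncpy(p_one ,p_two,keylen+1); /* copies the characters and the '\0' */
printf("%s",p_one);
free(p_one);
return 0;
}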
A declared char pointer only points at memory; it carries no information about how much was allocated. strcpy therefore cannot tell where the destination ends, so a bounded copy such as strncpy is safer in these conditions.
Yes it is allocating memory.
In C, it is considered bad practice to cast the result of malloc, as the cast can hide the compiler diagnostic for a missing #include <stdlib.h>; thanks Gewure
When you have a string literal in your code like that, you need to think of it as a constant value. Sure, you assigned it to a char*, but that does not mean you are allowed to modify it. Nothing in the C specification says this is legal.
On the other hand, this is okay:
const size_t MAX_STR = 50;
char p_one[MAX_STR] = "this is my first char pointer";
const char *p_two = "this is second";
strcpy( p_one, p_two );

What is a char*?

Why do we need the *?
char* test = "testing";
From what I understood, we only apply * onto addresses.
This is a char:
char c = 't';
It can only hold one character!
This is a C-string:
char s[] = "test";
It can hold multiple characters. Another way to write the above is:
char s[] = {'t', 'e', 's', 't', 0};
The 0 at the end is called the NUL terminator. It denotes the end of a C-string.
A char* stores the starting memory location of a C-string.1 For example, we can use it to refer to the same array s that we defined above. We do this by setting our char* to the memory location of the first element of s:
char* p = &(s[0]);
The & operator gives us the memory location of s[0].
Here is a shorter way to write the above:
char* p = s;
Notice:
*(p + 0) == 't'
*(p + 1) == 'e'
*(p + 2) == 's'
*(p + 3) == 't'
*(p + 4) == 0 // NUL
Or, alternatively:
p[0] == 't'
p[1] == 'e'
p[2] == 's'
p[3] == 't'
p[4] == 0 // NUL
Another common usage of char* is to refer to the memory location of a string literal:
const char* myStringLiteral = "test";
Warning: This string literal should not be changed at runtime. We use const to warn the programmer (and compiler) not to modify myStringLiteral in the following illegal manner:
myStringLiteral[0] = 'b'; // Illegal! Do not do this for const char*!
This is different from the array s above, which we are allowed to modify. This is because the string literal "test" is automatically copied into the array at initialization phase. But with myStringLiteral, no such copying occurs. (Where would we copy to, anyways? There's no array to hold our data... just a lonely char*!)
1 Technical note: char* merely stores a memory location to things of type char. It can certainly refer to just a single char. However, it is much more common to use char* to refer to C-strings, which are NUL-terminated character sequences, as shown above.
The char type can only represent a single character. When you have a sequence of characters, they are piled next to each other in memory, and the location of the first character in that sequence is returned (assigned to test). test is nothing more than a pointer to the memory location of the first character in "testing", saying that the type it points to is a char.
You can do one of two things:
char *test = "testing";
or:
char test[] = "testing";
Or, a few variations on those themes like:
char const *test = "testing";
I mention this primarily because it's the one you usually really want.
The bottom line, however, is that char x; will only define a single character. If you want a string of characters, you have to define an array of char or a pointer to char (which you'll initialize with a string literal, as above, more often than not).
There are real differences between the first two options though. char *test=... defines a pointer named test, which is initialized to point to a string literal. The string literal itself is allocated statically (typically right along with the code for your program), and you're not supposed to (attempt to) modify it -- thus the preference for char const *.
The char test[] = .. allocates an array. If it's a global, it's pretty similar to the previous except that it does not allocate a separate space for the pointer to the string literal -- rather, test becomes the name attached to the string literal itself.
If you do this as a local variable, test is still an array rather than a pointer - but since it's a local variable, it gets "auto" storage (typically on the stack), which is initialized (usually by copying from a normal, statically allocated string literal) on every entry to the block/scope where it's defined.
The latter versions (with an array of char) can act deceptively similar to a pointer, because the name of an array will decay to the address of the beginning of the array anytime you pass it to a function. There are differences though. You can modify the array, but modifying a string literal gives undefined behavior. Conversely, you can change the pointer to point at some other chars, so something like:
char *test = "testing";
if (whatever)
test = "not testing any more";
...is perfectly fine, but trying to do the same with an array won't work (arrays aren't assignable).
The main thing people forgot to mention is that "testing" is an array of chars in memory, there's no such thing as primitive string type in c++. Therefore as with any other array, you can't reference it as if it is an element.
char* represents the address of the beginning of a contiguous block of memory holding chars. You need it because you are not using a single char variable; you are addressing a whole array of chars.
When accessing this, functions will take the address of the first char and step through the memory. This is possible as arrays use contiguous memory (i.e. all of the memory is consecutive in memory).
Hope this clears things up! :)
Using a * says that this variable points to a location in memory. In this case, it is pointing to the location of the string "testing". With a char pointer, you are not limited to just single characters, because now you have more space available to you.
In C, an array is usually accessed through a pointer to its first element.

Difference between using character pointers and character arrays

Basic question.
char new_str[]="";
char * newstr;
If I have to concatenate some data into it or use string functions like strcat/substr/strcpy, what's the difference between the two?
I understand I have to allocate memory to the char * approach (Line #2). I'm not really sure how though.
And const char * and string literals are the same?
I need to know more on this. Can someone point to some nice exhaustive content/material?
An excellent source to clear up the confusion is Peter van der Linden's Expert C Programming: Deep C Secrets. Arrays and pointers are not the same; the difference is in how they are addressed in memory.
With an array, char new_str[]; the compiler has given new_str a memory address that is known at both compilation and runtime, e.g. 0x1234, hence indexing new_str with [] is simple. For example, for new_str[4], the code at runtime takes the address where new_str resides, e.g. 0x1234 (that is the address in physical memory), adds the index [4] to it, giving 0x1234 + 0x4, and retrieves the value from there.
Whereas with a pointer, the compiler gives the symbol char *newstr an address, e.g. 0x9876, but at runtime that address is used for indirect addressing. Suppose newstr was malloc'd: newstr = malloc(10);. The address of newstr itself is known to the compiler (0x9876), but what newstr points to is variable. So every time the code refers to newstr, at runtime it first fetches the contents of memory at 0x9876 (i.e. newstr), which is another memory address, e.g. 0x8765 (the one malloc returned), and then fetches the data from that address.
char new_str[] and char *newstr can often be used interchangeably in expressions, since the array name decays into a pointer to its zeroth element. That explains why you can write either newstr[5] or *(newstr + 5), and why the pointer form also works on the array, e.g. *(new_str + 1) = *newstr; or *(new_str + 1) = newstr[1];
In summary, the real difference between the two is how they are accessed in memory.
Get the book and read it and live it and breathe it. It's a brilliant book! :)
Please go through this article below:
Also note that with an array of char, as in your case char new_str[], new_str always refers to the base of the array. The array name itself can't be incremented. You can, however, use subscripts to access the next char in the array, e.g. new_str[3];
But in the case of a pointer to char, the pointer can be incremented (newstr++) to reach the next character.
Also I would suggest this article for more clarity.
This is a character array:
char buf [1000];
So, for example, this makes no sense:
buf = &some_other_buf;
This is because buf, though it has some characteristics of a pointer, already refers to the only place that makes sense for it.
char *ptr;
On the other hand, ptr is only a pointer, and may point somewhere. Most often, it's something like this:
ptr = buf; // #1: point to the beginning of buf, same as &buf[0]
or maybe this:
ptr = malloc (1000); // #2: allocate heap and point to it
or:
ptr = "abcdefghijklmn"; // #3: string constant
For all of these, *ptr can be written to, except the third case, where writing is undefined behaviour and many environments place string constants in unwritable memory.
*ptr++ = 'h'; // writes into #1: buf[0], #2: first byte of heap, or
// #3 overwrites "a"
strcpy (ptr, "ello"); // finishes writing hello and adds a NUL
The difference is that one is a pointer, the other is an array. You can, for instance, apply sizeof to the array and get its full size in bytes. You may be interested in peeking here
If you're using C++ as your tags indicate, you really should be using the C++ strings, not the C char arrays.
The string type makes manipulating strings a lot easier.
If you're stuck with char arrays for some reason, the line:
char new_str[] = "";
allocates 1 byte of space and puts a null terminator character into it. It's subtly different from:
char *new_str = "";
since that may give you a reference to non-writable memory. The statement:
char *new_str;
on its own gives you a pointer but nothing that it points to. It also has an indeterminate value if it's local to a function.
What people tend to do (in C rather than C++) is to do something like:
char *new_str = malloc (100); // (remember that this has to be freed) or
char new_str[100];
to get enough space.
If you use the str... functions, you're basically responsible for ensuring that you have enough space in the char array, lest you get all sorts of weird and wonderful practice at debugging code. If you use real C++ strings, a lot of the grunt work is done for you.
The type of the first is char[1], the second is char *. Different types.
Allocate memory for the latter with malloc in C, or new in C++.
char foo[] = "Bar"; // Allocates 4 bytes and fills them with
// 'B', 'a', 'r', '\0'.
The size here is implied from the initializer string.
The contents of foo are mutable. You can change foo[i] for example where i = 0..3.
OTOH if you do:
char *foo = "Bar";
The compiler now allocates a static string "Bar", typically in read-only memory, and it cannot be modified.
foo[i] = 'X'; // is now undefined.
char new_str[]="abcd";
This specifies an array of characters (a string) of size 5 bytes (one byte for each character plus one for the null terminator). So it stores the string 'abcd' in memory and we can access this string using the variable new_str.
char *new_str="abcd";
This specifies a string 'abcd' is stored somewhere in the memory and the pointer new_str points to the first character of that string.
To differentiate them in the memory allocation side:
// With char array, "hello" is copied into the array (on the stack for a local)
char s[] = "hello";
// With char pointer, "hello" is typically stored in a read-only data segment
const char *s = "hello";
// To allocate a string on the heap, malloc 6 bytes (5 characters plus the NUL at the end)
char *s = malloc(6);
strcpy(s, "hello"); // copy into the heap block; s = "hello" would only repoint s and leak the block
If you're in C++, why not use std::string for all your string needs? Especially anything dealing with concatenation. This will save you from a lot of problems.