Can you copy data to `char*` pointer without allocating resource first? - c++

I have seen an example here: http://www.cplusplus.com/reference/string/string/data/
...
std::string str = "Test string";
char* cstr = "Test string";
...
if ( memcmp (cstr, str.data(), str.length() ) == 0 )
std::cout << "str and cstr have the same content.\n";
Question> How can we directly copy data into the location where the pointer cstr pointed to without explicitly allocating space for it?

memcmp is comparing the content of memory pointed to by the two pointers. It does not copy anything.
[EDIT] The question was edited to add a concrete question. The answer is: you need the pointer to point to memory allocated one way or another if you want to copy data there.

How can we directly copy data into the location where the pointer cstr pointed to without explicitly allocating space for it?
You cannot, using an initialisation from a character string literal.
That memory will be placed in the static storage section of your program, and writing there invokes undefined behaviour (which merely happens to show up as an exception in your particular compiler's implementation).

Let's assume that you used memcpy(cstr, x, ...) instead of memcmp.
You use the phrase 'without allocating resource first'. This really has no meaning.
For memcpy to work, cstr must point at valid, writable memory.
So the following work:
char *cstr = new char[50];
char cstr[50];
char *cstr = (char *)malloc(50);
The following might work but shouldn't:
char *cstr = "Foo"; // literal is not writable
These will not:
char *cstr = nullptr;
char *cstr; // uninitialized; you just might get lucky, but it is very unlikely
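As a minimal sketch of the rule above (the destination must be valid, writable memory), the helper below copies a std::string into storage that it owns. The name copy_to_writable and the choice of std::vector<char> as the buffer are assumptions made for illustration; they do not come from the question.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: returns a writable, null-terminated copy of src.
std::vector<char> copy_to_writable(const std::string& src) {
    // Reserve room for every character plus the terminating '\0'.
    std::vector<char> buf(src.size() + 1);
    // c_str() is null-terminated, so copying size() + 1 bytes
    // brings the terminator along.
    std::memcpy(buf.data(), src.c_str(), src.size() + 1);
    return buf;
}
```

The vector releases its memory automatically, so there is no matching delete[] or free to remember.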

Related

pointer to char array, unhandled exception

I'm new to C++ and I'm playing with pointers. I can't figure out why this piece of code doesn't work for me. Can you tell me what's wrong with it?
char * name = "dharman";
char *ptr = name+3;
*ptr = 'a';
printf("%s", name);
I get unhandled exception all the time.
This alone is an error:
char * name = "dharman";
The string is in constant memory but the pointer's type indicates it can be modified. Attempting to modify it produces undefined behavior: on some platforms the program will appear to work, but you got unlucky.
This was a quirk in C++03; the newer C++11 spec makes it illegal. The reason it was ever done was C compatibility.
Whether you're writing in C++ or plain C, the solution is simple:
char name[] = "dharman";
Now the compiler stores the data in read-write memory because you have asked for an array of char, not a pointer to some other memory.
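To make the array fix concrete, here is a small sketch of the asker's snippet rewritten around char name[]. The function name patch_fourth_char is made up for this example, and the result is returned as a std::string only so that it is easy to inspect.

```cpp
#include <string>

// Hypothetical demo: same steps as the question, but on a writable array.
std::string patch_fourth_char() {
    char name[] = "dharman";  // array: a writable copy of the literal, '\0' included
    char* ptr = name + 3;     // points at name[3], the 'r'
    *ptr = 'a';               // fine: we are modifying our own array
    return std::string(name); // "dhaaman"
}
```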
String literals, like "dharman", are read-only and you cannot modify them. Instead, create and initialize an array that is not read only.
char name[] = "dharman";
name is a pointer to a string literal "dharman", which is located in read-only memory.
In your statement *ptr = 'a', you are trying to modify this string literal, which results in undefined behavior.
It doesn't work because "dharman" is constant; it's a string literal. You cannot change it!
String literals are usually placed in read-only segments of memory.
You are trying to modify a constant string.
You need to copy the constant first to some memory that you own.
Try this:
char *name = (char*) malloc(10);
memcpy(name, "dharman", strlen("dharman") + 1); // + 1 copies the terminating '\0'
...
You are setting name to point at a const string and then trying to modify it. Copy the string to a modifiable location:
char *name = (char*) malloc(strlen("dharman") + 1);
memcpy(name, "dharman", strlen("dharman") + 1);
The reason for the unhandled exception in your case is this line in your code:
char *ptr = name+3;
Suppose the base address of name is, say, 23300. Then char *ptr = name+3 evaluates to
23300 + (3 * sizeof(char)) = 23300 + (3 * 1) = 23303, because each char occupies one byte. So ptr points to the fourth character, 'r', in "dharman". Since "dharman" is a constant string literal you can't change its value, and that is why you are getting the error. If you remove the line *ptr = 'a'; the code will work without any issue.

Copy string from char pointer to char pointer

char * p_one = "this is my first char pointer";
char * p_two= "this is second";
strcpy(p_one ,p_two);
Consider the above code. It gives an access violation error.
Please help me understand:
1. Where is the string "this is my first char pointer" stored in memory: heap or stack?
2. Why do I need to allocate memory for p_one before calling strcpy, even though it already stores the first string? Why can't the "this is second" string be copied to the same location?
3. If I allocate memory for p_one before calling strcpy, what happens to the "this is my first char pointer" string that p_one pointed to? Is it kept in memory?
4. How does strcpy know whether a specific pointer has allocated memory or not?
1. Implementation-defined (usually read-only) memory. [Ref 1]
2. You do not need to, as long as you don't modify the source string literal.
3. If you allocate memory to p_one, then it will point to the newly allocated memory region. The string literal may or may not stay at the same place in memory, but it is guaranteed to be alive throughout the lifetime of the program. String literals have static storage duration. [Ref 2]
4. It doesn't. It is the user's responsibility to ensure that.
Good Read:
[Ref 1]
What is the difference between char a[] = "string"; and char *p = "string";?
[Ref 2]
"life-time" of string literal in C
First off, your compiler should be warning you that p_one and p_two are actually const char *, because the compiler allocates the storage for these strings at compile time.
The reason you cannot modify them is that, in theory, you could overwrite the memory after them; this is what makes buffer-overflow attacks possible.
Also, the compiler could be smart and notice that you use the same string in 10 places and store it only once, so modifying it from one place would change it everywhere, which destroys the logic of the other 9 places that use it.
Answering all the questions in order
The pointer variable itself is stored on the stack, but the string literal it points to is not; the literal lives in static storage. Remember that when you allocate memory yourself, the buffer must be large enough for the length of the string plus the terminating '\0' character.
This would be one solution, according to code you have mentioned:
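#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    const char * p_two = "this is second";
    size_t keylen = strlen(p_two);
    char * p_one = malloc(keylen + 1); /* + 1 for the terminating '\0' */
    if (p_one == NULL)
        return 1;
    strncpy(p_one, p_two, keylen + 1); /* copies the characters and the '\0' */
    printf("%s", p_one);
    free(p_one);
    return 0;
}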
A char pointer only holds an address; it carries no information about how large the buffer behind it is, so strcpy cannot tell where the destination ends. Hence a bounded copy such as strncpy is preferable in these conditions, provided you pass the size of the destination and make sure the result is null-terminated.
Yes, it is allocating memory.
In C, it is bad practice to cast the result of malloc, because the cast can hide the diagnostic for a missing #include <stdlib.h>. Thanks, Gewure.
When you have a string literal in your code like that, you need to think of it as a temporary constant value. Sure, you assigned it to a char*, but that does not mean you are allowed to modify it. Nothing in the C specification says this is legal.
On the other hand, this is okay:
const size_t MAX_STR = 50;
char p_one[MAX_STR] = "this is my first char pointer";
const char *p_two = "this is second";
strcpy( p_one, p_two );
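Since strcpy cannot check the destination size for you, one option is to wrap the check in a small helper. The sketch below is only an illustration: copy_if_fits and overwrite_first_with_second are hypothetical names, and the 50-byte buffer mirrors MAX_STR from the snippet above.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copies src into dst only when it fits, '\0' included.
bool copy_if_fits(char* dst, std::size_t dst_size, const char* src) {
    const std::size_t needed = std::strlen(src) + 1;  // characters + terminator
    if (needed > dst_size) return false;              // would overflow: refuse
    std::memcpy(dst, src, needed);
    return true;
}

// Illustrative use with the buffers from the answer above.
std::string overwrite_first_with_second() {
    const std::size_t MAX_STR = 50;
    char p_one[MAX_STR] = "this is my first char pointer";
    const char* p_two = "this is second";
    if (!copy_if_fits(p_one, sizeof p_one, p_two)) return "";
    return std::string(p_one);
}
```

Passing sizeof p_one works here because p_one is a real array; with a plain char* the caller has to track the buffer size separately.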

unexpected successful copy based on strlen

I was reviewing my skills with pointers and buffer in C++. I tried the code below and everything works fine. No leaks, no crash, nothing.
To be honest I didn't expect this.
When I call char* buf2 = new char[strlen(buf)] I didn't expect strlen(buf) to return the right size. I always thought that strlen
needs a null-terminated string to work. That is not the case here, so why does this code work?
int main(){
const char* mystr = "mineminemine";
char* buf = new char[strlen(mystr)];
memcpy(buf, mystr, strlen(mystr));
char* buf2 = new char[strlen(buf)];
memcpy(buf2, buf, strlen(buf));
delete[] buf2;
delete[] buf;
}
That's called undefined behavior: the program appears to work, but you can't rely on that.
When the memory is allocated, there happens to be a null character somewhere close enough to the start of the buffer, and the program can technically access all the memory between the start of the buffer and that null character, so you don't observe a crash.
You can't rely on that behavior. Don't write code like that, always allocate enough space to store the terminating null character.
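A minimal sketch of the corrected allocation, assuming you still want a raw character buffer: allocate strlen + 1 bytes and copy the terminator too. The helper name duplicate and the use of std::unique_ptr<char[]> (so that delete[] happens automatically) are choices made for this example.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical helper: returns an owned, null-terminated copy of src.
std::unique_ptr<char[]> duplicate(const char* src) {
    const std::size_t n = std::strlen(src) + 1;  // + 1 for the terminating '\0'
    std::unique_ptr<char[]> buf(new char[n]);
    std::memcpy(buf.get(), src, n);              // copies the '\0' as well
    return buf;
}
```

With the terminator in place, calling strlen on the copy is well defined and returns the same length as the original.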
Consider another way to do the same thing:
int main(){
std::string mystr = "mineminemine";
std::string mystr2 = mystr;
}
Internally, std::string manages a buffer with a null terminating character added. When you copy a standard string you don't have to worry about keeping track of the start and end of the buffer.
Now, considering the lifetime of the strings: these two variables are declared on the stack and destroyed when main goes out of scope (e.g. at termination). If you need strings to be shared among objects and you do not necessarily know when they will be destroyed, I recommend considering Boost shared pointers.
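For shared ownership, a rough sketch follows. It uses std::shared_ptr from the standard library (C++11) as a stand-in for the Boost shared pointer mentioned above, which is an assumption on my part; the Owner struct and the function name are hypothetical.

```cpp
#include <memory>
#include <string>

// Hypothetical owner type: each instance keeps the shared string alive.
struct Owner {
    std::shared_ptr<std::string> text;
};

// Returns how many owners still reference the string after the
// local handle has been released.
long owners_sharing_one_string() {
    auto text = std::make_shared<std::string>("mineminemine");
    Owner a{text};
    Owner b{text};
    text.reset();               // local handle lets go; a and b still own it
    return a.text.use_count();  // counts a and b
}
```

The string is destroyed when the last owner goes away, so no object needs to know which of them outlives the others.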

Dynamic Memory Allocation

I have a small confusion in the dynamic memory allocation concept.
If we declare a pointer say a char pointer, we need to allocate adequate memory space.
char* str = (char*)malloc(20*sizeof(char));
str = "This is a string";
But this will also work.
char* str = "This is a string";
So in which case we have to allocate memory space?
In the first sample you have a memory leak:
char* str = (char*)malloc(20*sizeof(char));
str = "This is a string"; // memory leak
The allocated address is replaced with a new one.
The new address is the address of "This is a string".
And you should change the second sample to:
const char* str = "This is a string";
because "This is a string" lives in a write-protected area.
The presumably C++98 code snippet
char* str = (char*)malloc(20*sizeof(char));
str = "This is a string";
does the following: (1) allocates 20 bytes, storing the pointer to that memory block in str, and (2) stores a pointer to a literal string, in str. You now have no way to refer to the earlier allocated block, and so cannot deallocate it. You have leaked memory.
Note that since str has been declared as char*, the compiler cannot practically detect if you try to use it to modify the literal. Happily, in C++0x this will not compile. I really like that rule change!
The code snippet
char* str = "This is a string";
stores a pointer to a string literal in a char* variable named str, just as in the first example, and just as that example it won't compile with a C++0x compiler.
Instead of this silliness, use for example std::string from the standard library, and sprinkle const liberally throughout your code.
In the first example, you dynamically allocated memory off the heap. It can be modified, and it must be freed. In the second example, the compiler statically allocated memory, and it cannot be modified, and must not be freed. You must use a const char*, not a char*, for string literals to reflect this and ensure safe usage.
Assigning to a char* variable makes it point to something else, so why did you allocate the memory in the first place if you immediately forget about it? That's a memory leak. You probably meant this:
char* str = (char*)malloc(20*sizeof(char));
strcpy(str, "This is a string");
// ...
free(str);
This will copy the second string to the first.
Since this is tagged C++, you should use a std::string:
#include <string>
std::string str = "This is a string";
No manual memory allocation and release needed, and assignment does what you think it does.
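To illustrate what "assignment does what you think it does" means in practice, here is a short sketch: assigning one std::string to another copies the characters, so the two objects can then be changed independently. The function name assign_and_modify is invented for the example.

```cpp
#include <string>

// Hypothetical demo: assignment copies the characters, not just an address.
std::string assign_and_modify() {
    std::string a = "This is a string";
    std::string b;
    b = a;       // b gets its own copy of the text
    b[0] = 't';  // changes b only; a is untouched
    return a + "|" + b;
}
```

Contrast this with the char* version, where str = "This is a string" only redirects the pointer and copies nothing.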
String literals are a special case in the language. Let's look closer at your code to understand this better:
First, you allocate a buffer in memory, and assign the address of that memory to str:
char* str = (char*)malloc(20*sizeof(char));
Then, you assign a string literal to str. This will overwrite what str held previously, so you will lose your dynamically allocated buffer, incidentally causing a memory leak. If you wanted to modify the allocated buffer, you would need at some point to dereference str, as in str[0] = 'A'; str[1] = '\0';.
str = "This is a string";
So, what is the value of str now? The compiler puts all string literals in static memory, so the lifetime of every string literal in the program equals the lifetime of the entire program. This statement is compiled to a simple assignment similar to str = (char*)0x1234, where 0x1234 is supposed to be the address at which the compiler has put the string literal.
That explains why this works well:
char* str = "This is a string";
Please also note that the static memory is not to be changed at runtime, so you should use const char* for this assignment.
So in which case we have to allocate memory space?
In many cases, for example when you need to modify the buffer. In other words; when you need to point to something that could not be a static string constant.
I want to add to Alexey Malistov's Answer by adding that you can avoid memory leak in your first example by copying "This is a string" to str as in the following code:
char* str = (char*)malloc(20*sizeof(char));
strcpy(str, "This is a string");
Please note that by "can" I don't mean you have to. It's just an addition to an answer, to add value to this thread.
In the first example you're just doing things wrong. You allocate dynamic memory on the heap and let str point to it. Then you just let str point to a string literal and the allocated memory is leaked (you don't copy the string into the allocated memory, you just change the address str is pointing at, you would have to use strcpy in the first example).

Difference between using character pointers and character arrays

Basic question.
char new_str[]="";
char * newstr;
If I have to concatenate some data into it or use string functions like strcat/substr/strcpy, what's the difference between the two?
I understand I have to allocate memory to the char * approach (Line #2). I'm not really sure how though.
And const char * and string literals are the same?
I need to know more on this. Can someone point to some nice exhaustive content/material?
An excellent source to clear up the confusion is Peter van der Linden's Expert C Programming: Deep C Secrets. It explains that arrays and pointers are not the same, and that the difference lies in how they are addressed in memory.
With an array, char new_str[];, the compiler gives new_str a memory address that is known at both compilation and run time, e.g. 0x1234, so indexing new_str with [] is simple. For example, for new_str[4], the code at run time takes the address where new_str resides, e.g. 0x1234, adds the index 4 to it to get 0x1234 + 0x4, and retrieves the value from there.
Whereas, with a pointer, the compiler gives the symbol char *newstr an address, e.g. 0x9876, but at run time that address is used through an extra level of indirection. Suppose newstr was malloc'd with newstr = malloc(10);. Every time the code refers to newstr, the address of newstr itself is known to the compiler (0x9876), but what newstr points to is variable. At run time, the code first fetches the contents of address 0x9876 (i.e. newstr), which is itself another memory address (the one malloc returned), e.g. 0x8765, and then fetches the data from that address, 0x8765.
char new_str[] and char *newstr can often be used interchangeably, since an array decays into a pointer to its zeroth element, and that explains why you can write newstr[5] or *(newstr + 5). Notice that the pointer expression also works on the array, even though only newstr was declared as a pointer; hence *(new_str + 1) = *newstr; or *(new_str + 1) = newstr[1];
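The equivalence of the subscript and pointer forms can be checked with a few lines. This is only a sketch; the function name index_forms_agree and the sample contents "abcdefgh" are made up for the example.

```cpp
// Hypothetical demo: subscripting and pointer arithmetic reach the same element.
bool index_forms_agree() {
    char new_str[] = "abcdefgh";  // array of 9 chars, '\0' included
    char* newstr = new_str;       // the array decays to a pointer to new_str[0]
    return newstr[5] == *(newstr + 5)     // pointer: both forms agree
        && new_str[5] == *(new_str + 5)   // array: both forms agree
        && &new_str[0] == newstr;         // same starting address
}
```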
In summary, the real difference between the two is how they are accessed in memory.
Get the book and read it and live it and breathe it. It's a brilliant book! :)
Please go through this article below:
Also, in the case of an array of char like yours, char new_str[], new_str always refers to the base of the array. The array name itself can't be incremented. You can use subscripts to access the next char in the array, e.g. new_str[3].
But in the case of a pointer to char, the pointer can be incremented (newstr++) to reach the next character in the array.
Also I would suggest this article for more clarity.
This is a character array:
char buf [1000];
So, for example, this makes no sense:
buf = &some_other_buf;
This is because buf, though it has some characteristics of a pointer, is already pointing to the only place that makes sense for it.
char *ptr;
On the other hand, ptr is only a pointer, and may point somewhere. Most often, it's something like this:
ptr = buf; // #1: point to the beginning of buf, same as &buf[0]
or maybe this:
ptr = malloc (1000); // #2: allocate heap and point to it
or:
ptr = "abcdefghijklmn"; // #3: string constant
For all of these, *ptr can be written to, except in the third case, where some compilation environments make string constants unwritable (and writing to a string constant is undefined behavior in any case).
*ptr++ = 'h'; // writes into #1: buf[0], #2: first byte of heap, or
// #3 overwrites "a"
strcpy (ptr, "ello"); // finishes writing hello and adds a NUL
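Here is a small sketch of case #1 from the lines above, with ptr pointing at a real array so that both writes are valid. The function name write_hello is an assumption for the example, and the result is returned as a std::string only for easy inspection.

```cpp
#include <cstring>
#include <string>

// Hypothetical demo of case #1: ptr points into buf, so writing is valid.
std::string write_hello() {
    char buf[1000];
    char* ptr = buf;           // same as &buf[0]
    *ptr++ = 'h';              // writes buf[0], then advances ptr to buf[1]
    std::strcpy(ptr, "ello");  // writes "ello" and the terminating NUL from buf[1]
    return std::string(buf);   // buf now holds "hello"
}
```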
The difference is that one is a pointer and the other is an array. You can, for instance, apply sizeof to an array to get its size in bytes. You may be interested in peeking here
If you're using C++ as your tags indicate, you really should be using the C++ strings, not the C char arrays.
The string type makes manipulating strings a lot easier.
If you're stuck with char arrays for some reason, the line:
char new_str[] = "";
allocates 1 byte of space and puts a null terminator character into it. It's subtly different from:
char *new_str = "";
since that may give you a reference to non-writable memory. The statement:
char *new_str;
on its own gives you a pointer but nothing that it points to. It can also have a random value if it's local to a function.
What people tend to do (in C rather than C++) is to do something like:
char *new_str = malloc (100); // (remember that this has to be freed) or
char new_str[100];
to get enough space.
If you use the str... functions, you're basically responsible for ensuring that you have enough space in the char array, lest you get all sorts of weird and wonderful practice at debugging code. If you use real C++ strings, a lot of the grunt work is done for you.
The type of the first is char[1], the second is char *. Different types.
Allocate memory for the latter with malloc in C, or new in C++.
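The type difference can be verified at compile time. The sketch below uses decltype and std::is_same from C++11; the two function names are hypothetical and exist only to hand the sizes back to the caller.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical demo: char new_str[] = "" really is an array of one char.
std::size_t empty_array_bytes() {
    char new_str[] = "";
    static_assert(std::is_same<decltype(new_str), char[1]>::value,
                  "new_str has type char[1]");
    return sizeof new_str;  // 1: just the '\0'
}

// Hypothetical demo: a char* is a pointer, whatever it points at.
std::size_t pointer_bytes() {
    char* newstr = nullptr;
    static_assert(std::is_same<decltype(newstr), char*>::value,
                  "newstr has type char*");
    return sizeof newstr;   // size of the pointer itself
}
```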
char foo[] = "Bar"; // Allocates 4 bytes and fills them with
// 'B', 'a', 'r', '\0'.
The size here is implied from the initializer string.
The contents of foo are mutable. You can change foo[i] for example where i = 0..3.
OTOH if you do:
char *foo = "Bar";
The compiler now allocates a static string "Bar" in read-only memory, and it cannot be modified.
foo[i] = 'X'; // is now undefined.
char new_str[]="abcd";
This specifies an array of characters (a string) of size 5 bytes (one byte for each character plus one for the null terminator). So it stores the string 'abcd' in memory and we can access this string using the variable new_str.
char *new_str="abcd";
This specifies a string 'abcd' is stored somewhere in the memory and the pointer new_str points to the first character of that string.
To differentiate them in the memory allocation side:
// With a char array, "hello" is stored in the array itself (on the stack for a local variable)
char s[] = "hello";
// With char pointer, "hello" is stored in the read-only data segment in C++'s memory layout.
char *s = "hello";
// To allocate a string on the heap, malloc 6 bytes: 5 characters plus the NUL byte at the end
char *s = (char *)malloc(6);
strcpy(s, "hello"); // copy into the buffer; s = "hello" would leak it
If you're in c++ why not use std::string for all your string needs? Especially anything dealing with concatenation. This will save you from a lot of problems.
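As a brief sketch of that advice, concatenation with std::string needs no size bookkeeping at all. The function build_greeting and its sample text are invented for illustration.

```cpp
#include <string>

// Hypothetical demo: the string grows as needed; no strcat, no buffer sizes.
std::string build_greeting(const std::string& name) {
    std::string out = "hello";
    out += ", ";      // append a literal
    out += name;      // append another string
    out.append("!");  // append() is equivalent to += here
    return out;
}
```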