I'm just curious, I want to know what's going on here:
class Test
{
char * name;
public:
Test(char * c) : name(c){}
};
1) Why won't Test(const char * c) : name(c){} work? Because char * name isn't const? But what about this:
main(){
char * name = "Peter";
}
name is char*, but "Peter" is const char*, right? So how does that initialization work?
2) Test(char * c) : name(c){ c[0] = 'a'; } - this crashes the program. Why?
Sorry for my ignorance.
Why won't Test(const char * c) : name(c) {} work? Because char * name isn't const?
Correct.
how does this initialization work: char * name = "Peter";
A C++ string literal is of type char const[] (see here, as opposed to just char[] in C, as it didn't have the const keyword1). This assignment is considered deprecated in C++, yet it is still allowed2 for backward compatibility with C.
Test(char * c) : name(c) { c[0] = 'a'; } crashes the program. Why?
What are you passing to Test when initializing it? If you're passing a string literal or an illegal pointer, doing c[0] = 'a' is not allowed.
1 The old version of the C programming language (as described in the K&R book published in 1978) did not include the const keyword. Since then, ANSI C borrowed the idea of const from C++.
2 Valid in C++03, no longer valid in C++11.
A conversion to const is a one-way street, so to speak.
You can convert from T * to T const * implicitly.
Conversion from T const * to T * requires an explicit cast. Even if you started from T *, then converted to T const *, converting back to T * requires an explicit cast, even though it's really just "restoring" the access you had to start with.
Note that throughout, T const * and const T * are precisely equivalent, and T stands for "some arbitrary type" (char in your example, but could just as easily be something else like int or my_user_defined_type).
Initializing a char * from a string literal (e.g., char *s = "whatever";) is allowed even though it violates this general rule (the literal itself is basically const, but you're creating a non-const pointer to it). This is simply because there's lots of code that depends on doing this, and nobody's been willing to break that code, so they have a rule to allow it. That rule's been deprecated, though, so at least in theory some future compiler could reject code that depends on it.
Since the string literal itself is basically const, any attempt at modifying it results in undefined behavior. On most modern systems, this will result in the process being terminated, because the memory storing the string literal will be marked at 'read only'. That's not the only possible result. Just for example, back in the days of MS-DOS, it would often succeed. It could still have bizarre side-effects though. For one example, many compilers "knew" that string literals were supposed to be read-only, so they'd "merge" identical string literals. Therefore if you had something like:
char *a = "Peter"; a[1] = 'a';
char *b = "Peter";
cout << b;
The compiler would have "merged" a and b to actually point at the same memory -- so when you modified a, that change would also affect b, so it would print out "Pater" instead of "Peter".
Note that the string literals didn't need to be entirely identical for this to happen either. As long as one was identical to the end of another, they could be merged:
char *a = "this?";
char *b = "What's this?";
a[2] = 'a';
a[3] = 't';
cout << b; // could print "What's that?"
Mandating one behavior didn't make sense, so the result was (and is) simply undefined.
First of all this is C++, you have std::string. You should really consider using it.
Regarding your question, "Peter" is a char literal, hence it is unmodifiable and surely you can't write on it. You can:
have a const char * member variable and initialize it like you are doing name(c), by declaring "Peter" as const
have a char * member variable and copy the content, eg name(strdup(c)) (and remember to release it in destructor.
Correct.
"Peter" is typically stored in a read-only memory location (actually, it depends on what type of device we are on) because it is a string literal. It is undefined what happens when you attempt to modify a string literal (but you can probably guess that you shouldn't).
You should use std::string anyways.
1a) Right
1b) "Peter" is not const char*, its is char* but it may not be modified. The reason is for compatibility with times before const existed in the language. A lot of code already existed that said char* p = "fred"; and they couldn't just make that code illegal overnight.
2) Can't say why that would crash the program without seeing how you are using that constructor.
Related
I have a code like this but I keep receiving this error :
A value of type "const char*" cannot be used to initialize an entity of type "char *"
What is going on?
I have read up on the following threads but have not been able to see any result to my answer as all of them are either from char to char* or char* to char:
Value type const char cannot be used to initialize an entity of type char*
Value of type char* cannot be used to initialize an entity of type "char"
#include <iostream>;
using namespace std;
int main() {
int x = 0; //variable x created
int cars (14);//cars is created as a variable with value 14
int debt{ -1000 };//debt created with value 1000
float cash = 2.32;
double credit = 32.32;
char a = 'a';//for char you must use a single quote and not double
char* sandwich = "ham";
return 0;
}
I am using Visual Studio Community 2017
That is correct. Let’s say you had the following code:
const char hello[] = "hello, world!";
char* jello = hello; // Not allowed, because:
jello[0] = 'J'; // Undefined behavior!
Whoops! A const char* is a non-const pointer to const char. If you assign its value to a non-const char*, you’ve lost its const property.
A const pointer to non-const char would be a char* const, and you can initialize a char* from that all day if you want.
You can, if you really want, achieve this with const_cast<char*>(p), and I occasionally have, but it’s usually a sign of a serious design flaw. If you actually get the compiler to emit instructions to write to the memory aliased by a string constant, you get undefined behavior. One of the many things that might go wrong is that some implementations will store the constant in read-only memory and crash. Or the same bytes of memory might be re-used for more than one purpose, because after all, we warned you never to change it.
By the way, the rules in C are different. This is solely for backward-compatibility with early versions of C that did not have the const keyword, and you should never write new code that uses a non-const alias to a string constant.
You need to make your string literal type const because in C++ it is a constant array of char, unlike C where it is just an array of char. You cannot change a string literal, so making it const is preferred in C++ for extra safety. It is the same reason you have to use an explicit cast when going from const char* to char*. It's still technically "allowed" in C++ since it is allowed in C, which is why it's just a warning. It's still bad practice to do so. To fix the warning, make it const.
const char* sandwich = "ham";
Your code (and underlying assumption) is valid pre C++11 standard.
String literals (e.g. "ham") since C++11 are of type const char* (or const char[]) if you will instead of char * they used to be. [Always read specs for breaking changes!!!]
Hence the warning in VS 2017. Change the compiler version to pre C++11 version and you will be amazed.
This has subtle nuances and can cause frustrating debug sessions
// C++11 or later
auto c = "Rowdie";
// c has type const char*, can't use c to modify literal
c[0] = 'H'; // illegal - CTE
// -vs-
char * d = "Rowdie";
d[0] = 'H';
cout << d; // outputs "Howdie"
Also another example is auto return type from functions
auto get_literal() {
// ... function code
return "String Literal";
}
// and using value later
char* lit = get_literal(); // You get same error as const char* cannot be init to char*
im realy confused about const char * and char *.
I know in char * when we want to modify the content, we need to do something like this
const char * temp = "Hello world";
char * str = new char[strlen(temp) + 1];
memcpy(str, temp, strlen(temp));
str[strlen(temp) + 1] = '\0';
and if we want to use something like this
char * str = "xxx";
char * str2 = "xts";
str = str2;
we get compiler warning. it's ok I know when i want to change char * I have to use something memory copy. but about const char * im realy confused. in const char * I can use this
const char * str = "Hello";
const char * str2 = "World";
str = str2; // and now str is Hello
and I have no compiler error ! why ? why we use memory copy when is not const and in const we only use equal operator ! and done !... how possible? is it ok to just use equal in const? no problem happen later?
As other answers say, you should distinguish pointers and bytes they point to.
Both types of pointers, char * and const char *, can be changed, that is, "redirected" to point to different bytes. However, if you want to change the bytes (characters) of the strings, you cannot use const char *.
So, if you have string literals "Hello" and "World" in your program, you can assign them to pointers, and printing the pointer will print the corresponding literal. However, to do anything non-trivial (e.g. change Hello to HELLO), you will need non-const pointers.
Another example: with some pointer manipulation, you can remove leading bytes from a string literal:
const char* str = "Hello";
std::cout << str; // Hello
str = str + 2;
std::cout << str; // llo
However, if you want to extract a substring, or do any other transformation on a string, you should reallocate it, and for that you need a non-const pointer.
BTW since you are using C++, you can use std::string, which makes it easier to work with strings. It reallocates strings without your intervention:
#include <string>
std::string str("Hello");
str = str.substr(1, 3);
std::cout << str; // ell
This is a confusing hangover from the days of early C. Early C didn't have const, so string literals were "char *". They remained char * to avoid breaking old code, but they became non-modifiable, so const char * in all but name. So modern C++ either warns or gives an error (to be strictly conforming) when the const is omitted.
Your memcpy missed the trailing nul byte, incidentally. Use strcpy() to copy a string, that's the right function with the right name. You can create a string in read/write memory by use of the
char rwstring[] = "I am writeable";
syntax.
That is cause your variables are just a pointers *. You're not modifiying their contents, but where they are pointing to.
char * a = "asd";
char * b = "qwe";
a = b;
now you threw away the contents of a. Now a and b points to the same place. If you modify one, both are modified.
In other words. Pointers are never constants (mostly). your const predicate in a pointer variable does not means nothing to the pointer.
The real difference is that the pointer (that is not const) is pointing to a const variable. and when you change the pointer it will be point to ANOTHER NEW const variable. That is why const has no effect on simple pointers.
Note: You can achieve different behaviours with pointers and const with more complex scenario. But with simple as it, it mostly has no effect.
Citing Malcolm McLean:
This is a confusing hangover from the days of early C. Early C didn't have const, so string literals were "char *". They remained char * to avoid breaking old code, but they became non-modifiable, so const char * in all but name.
Actually, string literals are not pointers, but arrays, this is why sizeof("hello world") works as a charm (yields 12, the terminating null character is included, in contrast to strlen...). Apart from this small detail, above statement is correct for good old C even in these days.
In C++, though, string literals have been arrays of constant characters (char const[]) right from the start:
C++ standard, 5.13.5.8:
Ordinary string literals and UTF-8 string literals are also referred to as narrow string literals. A narrow string literal has type “array of n const char”, where n is the size of the string as defined below, and has static storage duration.
(Emphasised by me.) In general, you are not allowed to assign pointer to const to pointer to non-const:
char const* s = "hello";
char ss = s;
This will fail to compile. Assigning string literals to pointer to non-const should normally fail, too, as the standard explicitly states in C.1.1, subclause 5.13.5:
Change: String literals made const.
The type of a string literal is changed from “array of char” to “array of const char”.
[...]char* p = "abc"; // valid in C, invalid in C++
Still, string literal assignement to pointer to non-const is commonly accepted by compilers (as an extension!), probably to retain compatibility to C. As this is, according to the standard, invalid, the compiler yields a warning, at least...
I need to clarify my concepts regarding the basics of pointer initialization in C++. As per my understanding, a pointer must be assigned an address before putting some value using the pointer.
int *p;
*p=10; //inappropriate
cout << *p <<"\n";
This would probably show the correct output (10) but this may cause issue in larger programs since p initially had garbage address which can be anything & may later be used somewhere else in the program as well.So , I believe this is incorrrect, the correct way is:
int *p;
int x=10;
p=&x; //appropriate
cout << *p <<"\n";
My question is, if the above understanding is correct, then does the same apply on char* as well?:
const char *str="hello"; // inappropriate
cout << str << "\n";
//OR
const string str1= "hello";
const char str2[6] ="world";
const char *str=str1; //appropriate
const char *st=str2; //appropriate
cout << str << st << "\n";
Please advice
Your understanding of strings is incorrect.
Lets take for example the very first line:
const char *str="hello";
This is actually correct. A string literal like "hello" is turned into a constant array by the compiler, and like all arrays it can decay to a pointer to its first element. So what you are doing is making str point to the first character of the array.
Then lets continue with
const string str1= "hello";
const char *str=str1;
This is actually wrong. A std::string object have no casting operator defined to cast to a const char *. The compiler will give you an error for this. You need to use the c_str function go get a pointer to the contained string.
Lastly:
const char str2[6] ="world";
const char *st=str2; //appropriate
This is really no different than the first line when you declare and initialize str. This is, as you say, "appropriate".
About that first example with the "inappropriate" pointer:
int *p;
*p=10; //inappropriate
cout << *p <<"\n";
This is not only "inappropriate", this leads to undefined behavior and may actually crash your program. Also, the correct term is that the value of p is indeterminate.
When I declare a pointer
int *p;
I get an object p whose values are addresses. No ints are created anywhere. The thing you need to do is think of p as being an address rather than being an int.
At this point, this isn't particularly useful since you have no addresses you could assign to it other than nullptr. Well, technically that's not true: p itself has an address which you can get with &p and store it in an int**, or even do something horrible like p = reinterpret_cast<int*>(&p);, but let's ignore that.
To do something with ints, you need to create one. e.g. if you go on to declare
int x;
you now have an int object whose values are integers, and we could then assign its address to p with p = &x;, and then recover the object from p via *p.
Now, C style strings have weird semantics — the weirdest aspect being that C doesn't actually have strings at all: it's always working with arrays of char.
String literals, like "Hello!", are guaranteed to (act1 like they) exist as an array of const char located at some address, and by C's odd conversion rules, this array automatically converts to a pointer to its first element. Thus,
const char *str = "hello";
stores the address of the h character in that character array. The declaration
const char str2[6] ="world";
works differently; this (acts1 like it) creates a brand new array, and copies the contents of the string literal "world" into the new array.
As an aside, there is an obsolete and deprecated feature here for compatibility with legacy programs, but for some misguided reason people still use it in new programs these days so you should be aware of it and that it's 'wrong': you're allowed to break the type system and actually write
char *str = "hello";
This shouldn't work because "hello" is an array of const char, but the standard permits this specific usage. You're still not actually allowed to modify the contents of the array, however.
1: By the "as if" rule, the program only has to behave as if things happen as I describe, but if you peeked at the assembly code, the actual way things happen can be very different.
Maybe a simple issue, but I've gotten confused between "byte arrays", pointers and casts in c++.
Take a look at the following and let me know what I need to read about to fix it, as well as the fix. It relates to the utf8proc library.
const unsigned char *aa = (const unsigned char*)e.c_str();
utf8proc_uint8_t* a = utf8proc_NFC(aa);
char b = (char)a;
string d = string(b);
It is bad enough no need for an error message here, but there is no constructor string on the string(b) line.
There appear to be a couple problems here. The biggest is the assignment:
char b = (char)a;
What you are doing is telling the compiler to take the pointer (memory location) and convert that to a char, then assign it to single char value b. So you'll basically have random jibberish in b.
Instead, if you want to treat a like a basic char*, you would write:
char* b = (char*)a;
Then you could use the string class with either:
string d = string(b);
or you could skip several line by the direct conversion:
string d = string((char*)a);
You are also looking for a headache down the line if you don't delete the conversion value returned by the utf8proc_NFC() call, and if you don't do an error check after the conversion.
Plus I'll put in a plug for using some Hungarian notation to distinguish a pointer (a 'p' prefix on variables). This makes it obvious that you can do things like:
char tmp = *pStr; // a single character (first in the string)
char tmp2 = pStr[1]; // a single character (second in the string)
char* pTmp = pStr; // a pointer to a null terminated string
But you would never see:
char tmp3 = (char)pStr; // compiles, but makes no sense to treat pointer as a character.
So here is a clean version of all of this:
utf8proc_uint8_t* pUTF = utf8proc_NFC( (const unsigned char*)e.c_str() );
string strUTF;
if (pUTF)
{
strUTF = (char*)pUTF;
free pUTF;
}
This code is almost certainly not what you want, since it is casting a pointer into a scalar.
utf8proc_uint8_t* a = ...;
char b = (char)a;
Instead, you want to cast and produce a pointer:
utf8proc_uint8_t* a = ...;
const char *b = (const char *)a;
I also added const, which is not strictly necessary but a good idea to use wherever you can.
Been thinking, what's the difference between declaring a variable with [] or * ? The way I see it:
char *str = new char[100];
char str2[] = "Hi world!";
.. should be the main difference, though Im unsure if you can do something like
char *str = "Hi all";
.. since the pointer should the reference to a static member, which I don't know if it can?
Anyways, what's really bugging me is knowing the difference between:
void upperCaseString(char *_str) {};
void upperCaseString(char _str[]) {};
So, would be much appreciated if anyone could tell me the difference? I have a hunch that both might be compiled down the same, except in some special cases?
Ty
Let's look into it (for the following, note char const and const char are the same in C++):
String literals and char *
"hello" is an array of 6 const characters: char const[6]. As every array, it can convert implicitly to a pointer to its first element: char const * s = "hello"; For compatibility with C code, C++ allows one other conversion, which would be otherwise ill-formed: char * s = "hello"; it removes the const!. This is an exception, to allow that C-ish code to compile, but it is deprecated to make a char * point to a string literal. So what do we have for char * s = "foo"; ?
"foo" -> array-to-pointer -> char const* -> qualification-conversion -> char *. A string literal is read-only, and won't be allocated on the stack. You can freely make a pointer point to them, and return that one from a function, without crashing :).
Initialization of an array using a String literal
Now, what is char s[] = "hello"; ? It's a whole other thing. That will create an array of characters, and fill it with the String "hello". The literal isn't pointed to. Instead it is copied to the character-array. And the array is created on the stack. You cannot validly return a pointer to it from a function.
Array Parameter types.
How can you make your function accept an array as parameter? You just declare your parameter to be an array:
void accept_array(char foo[]);
but you omit the size. Actually, any size would do it, as it is just ignored: The Standard says that parameters declared in that way will be transformed to be the same as
void accept_array(char * foo);
Excursion: Multi Dimensional Arrays
Substitute char by any type, including arrays itself:
void accept_array(char foo[][10]);
accepts a two-dimensional array, whose last dimension has size 10. The first element of a multi-dimensional array is its first sub-array of the next dimension! Now, let's transform it. It will be a pointer to its first element again. So, actually it will accept a pointer to an array of 10 chars: (remove the [] in head, and then just make a pointer to the type you see in your head then):
void accept_array(char (*foo)[10]);
As arrays implicitly convert to a pointer to their first element, you can just pass an two-dimensional array in it (whose last dimension size is 10), and it will work. Indeed, that's the case for any n-dimensional array, including the special-case of n = 1;
Conclusion
void upperCaseString(char *_str) {};
and
void upperCaseString(char _str[]) {};
are the same, as the first is just a pointer to char. But note if you want to pass a String-literal to that (say it doesn't change its argument), then you should change the parameter to char const* _str so you don't do deprecated things.
The three different declarations let the pointer point to different memory segments:
char* str = new char[100];
lets str point to the heap.
char str2[] = "Hi world!";
puts the string on the stack.
char* str3 = "Hi world!";
points to the data segment.
The two declarations
void upperCaseString(char *_str) {};
void upperCaseString(char _str[]) {};
are equal, the compiler complains about the function already having a body when you try to declare them in the same scope.
Okay, I had left two negative comments. That's not really useful; I've removed them.
The following code initializes a char pointer, pointing to the start of a dynamically allocated memory portion (in the heap.)
char *str = new char[100];
This block can be freed using delete [].
The following code creates a char array in the stack, initialized to the value specified by a string literal.
char [] str2 = "Hi world!";
This array can be modified without problems, which is nice. So
str2[0] = 'N';
cout << str2;
should print Ni world! to the standard output, making certain knights feel very uncomfortable.
The following code creates a char pointer in the stack, pointing to a string literal... The pointer can be reassigned without problems, but the pointed block cannot be modified (this is undefined behavior; it segfaults under Linux, for example.)
char *str = "Hi all";
str[0] = 'N'; // ERROR!
The following two declarations
void upperCaseString(char *_str) {};
void upperCaseString(char [] _str) {};
look the same to me, and in your case (you want to uppercase a string in place) it really doesn't matters.
However, all this begs the question: why are you using char * to express strings in C++?
As a supplement to the answers already given, you should read through the C FAQ regarding arrays vs. pointers. Yes it's a C FAQ and not a C++ FAQ, but there's little substantial difference between the two languages in this area.
Also, as a side note, avoid naming your variables with a leading underscore. That's reserved for symbols defined by the compiler and standard library.
Please also take a look at the http://c-faq.com/aryptr/aryptr2.html The C-FAQ might prove to be an interesting read in itself.
The first option dynamically allocates 100 bytes.
The second option statically allocates 10 bytes (9 for the string + nul character).
Your third example shouldn't work - you're trying to statically-fill a dynamic item.
As to the upperCaseString() question, once the C-string has been allocated and defined, you can iterate through it either by array indexing or by pointer notation, because an array is really just a convenient way to wrap pointer arithmetic in C.
(That's the simple answer - I expect someone else will have the authoritative, complicated answer out of the spec :))