C++ Concatenation Oops - c++

I've been going back and forth between C++, C# and Java lately, and while writing some C++ code I did something like this:
string LongString = "Long String";
char firstChar = LongString.at(0);
And then tried using a method that looks like this,
void MethodA(string str)
{
//some code
cout << str;
//some more code
}
Here's how I implemented it.
MethodA("1. "+ firstChar );
Though perfectly valid in C# and Java, this did something weird in C++.
I expected something like
//1. L
but it gave me part of some other string literal later in the program.
What did I actually do?
I should note I've fixed the mistake so that it prints what I expect but I'm really interested in what I mistakenly did.
Thanks ahead of time.

C++ does not define addition on string literals as concatenation. Instead, a string literal decays to a pointer to its first element, and a single character is treated as a numeric value, so the result is a pointer offset from the literal's location in the program's read-only memory segment to some other location.
To get addition as concatenation, use std::string:
MethodA(std::string() + "1. " + firstChar);

MethodA(std::string("1. ")+ firstChar );
(since "1. " is a const char[4] and has no concatenation methods)
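As a minimal sketch of that fix, the helper below (the name make_label is hypothetical, not from the original post) starts from a std::string so that + means concatenation rather than pointer arithmetic:

```cpp
#include <string>

// Hypothetical helper for illustration only.
// Because the left operand is a std::string, operator+ appends the char
// instead of offsetting a pointer.
std::string make_label(char c) {
    return std::string("1. ") + c;
}
```

With firstChar equal to 'L', make_label(firstChar) yields the "1. L" the question expected.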

The problem is that "1. " is a string literal (array of characters), that will decay into a pointer. The character itself is a char that can be promoted to an int, and addition of a const char* and an int is defined as calculating a new pointer by offsetting the original pointer by that many positions.
Your code in C++ is calling MethodA with the result of adding (int)firstChar (ASCII value of the character) to the string literal "1. ", which if the value of firstChar is greater than 4 (which it probably is) will be undefined behavior.
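To make the pointer arithmetic concrete, here is a small sketch with an offset that stays inside the literal, so the behavior is well defined. The function names are made up for illustration, and the value 76 for 'L' assumes an ASCII-compatible character set:

```cpp
#include <string>

// The char is promoted to an int before the addition.
// On ASCII systems 'L' becomes 76, far past the 4 bytes of "1. ".
int offset_for(char c) {
    return static_cast<int>(c);
}

// Adding a small, in-range integer to the decayed pointer simply skips
// that many characters of the literal.
std::string skip_prefix() {
    const char* base = "Long String";  // array decays to const char*
    const char* p = base + 5;          // now points at the 'S'
    return std::string(p);
}
```

Here skip_prefix() returns "String", while an offset of 76 applied to "1. " would land well outside the array.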

MethodA("1. "+ firstChar ); //your code
doesn't do what you want it to do. It is pointer arithmetic: it just adds an integral value (firstChar) to the address of the string literal "1. ", and the result (of type char const*) is passed to the function, where it is converted to string. Depending on the value of firstChar, it could invoke undefined behavior. In fact, in your case it does invoke undefined behavior, because the resulting pointer points beyond the string literal.
Write this:
MethodA(string("1. ")+ firstChar ); //my code

String literals in C++ are not instances of std::string, but rather constant arrays of chars. So adding a char to one triggers an implicit conversion to a character pointer, which is then advanced by the numerical value of the character, and which happened to point to another string literal stored in the program's data section.

Related

Why do I get a deprecated conversion warning with string literals in one case but not another?

I am learning C++. In the program shown here, as far as I know, str1 and str2 store the addresses of first characters of each of the relevant strings:
#include <iostream>
using namespace std;
int main()
{
char str1[]="hello";
char *str2="world";
cout<<str1<<endl;
cout<<str2<<endl;
}
However, str1 is not giving any warnings, while with str2 I get this warning:
warning: deprecated conversion from string constant to 'char*' [-Wwrite-strings]
char *str2="world";
What's different between these two declarations that causes the warning in the second case but not the first?
When you write
char str1[] = "hello";
you are saying "please make me an array of chars that holds the string "hello", and please choose the size of the array str1 to be the size of the string initializing it." This means that str1 ends up storing its own unique copy of the string "hello". The ultimate type of str1 is char[6] - five for hello and one for the null terminator.
When you write
char *str2 = "world";
you are saying "please make me a pointer of type char * that points to the string literal "world"." The string literal "world" has type const char[6] - it's an array of six characters (five for world and one for the null terminator), and importantly those characters are const and can't be modified. Since you're pointing at that array with a char * pointer, you're losing the const modifier, which means that you now (unsafely) have a non-const pointer to a const bit of data.
The reason that things are different here is that in the first case, you are getting a copy of the string "hello", so the fact that your array isn't const isn't a problem. In the second case, you are not getting a copy of "world" and are instead getting a pointer to it, and since you're getting a pointer to it there's a concern that modifying it could be a real problem.
Stated differently, in the first case, you're getting an honest-to-goodness array of six characters that have a copy of hello in them, so there's no problem if you then decide to go and mutate those characters. In the second case, you're getting a pointer to an array of six characters that you're not supposed to modify, but you're using a pointer that permits you to mutate things.
So why is it that "world" is a const char[6]? As an optimization on many systems, the compiler will only put one copy of "world" into the program and have all copies of the literal "world" point to the exact same string in memory. This is great, as long as you don't change the contents of that string. The C++ language enforces this by saying that those characters are const, so mutating them leads to undefined behavior. On some systems, that undefined behavior leads to things like "whoa, my string literal has the wrong value in it!," and in others it might just segfault.
The problem is that you are trying to convert a string literal (of type const char[6], which decays to const char*) to char*.
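The difference between the two declarations can be sketched as follows. The function names are hypothetical; the point is that the array is a writable private copy, while the literal should only be reached through a pointer to const:

```cpp
#include <cstddef>
#include <string>

// The array form owns a copy of the characters, so writing to it is fine.
std::string modify_array_copy() {
    char str1[] = "hello";   // type is char[6]
    str1[0] = 'j';           // modifies the copy, not the literal
    return std::string(str1);
}

// The array size is deduced from the initializer: 5 letters plus '\0'.
std::size_t array_size() {
    char str1[] = "hello";
    return sizeof(str1);
}

// Pointing at the literal itself: keeping const avoids the warning and
// stops accidental writes at compile time.
const char* literal_pointer() {
    const char* str2 = "world";
    return str2;
}
```

Declaring str2 as const char* is the usual way to silence the -Wwrite-strings warning without copying the string.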

c++ When I cast a string to an int, the int is just a random number?

So I have a function which edits the values, but I cout the values in main to see that it outputs 1309668848 and changes every time I run the program. (this isn't happening in the preprocessor). I have been struggling with this for a while and decided to come here for advice.
Here's the function.
void GetDahInt() {
std::string NewValueS;
getline(std::cin, NewValueS);
NewValue = (int)NewValueS.c_str();
}
You can use
std::stoi( str )
Discards any whitespace characters (as identified by calling isspace()) until the first non-whitespace character is found, then takes as many characters as possible to form a valid base-n (where n=base) integer number representation and converts them to an integer value.
Source: documentation of stoi
NewValueS.c_str() returns a pointer to array of characters.
You are casting the pointer to int (get the memory address of c_str()).
See: http://www.cplusplus.com/reference/string/string/c_str/
string::c_str
Get C string equivalent
Returns a pointer to an array that contains a null-terminated sequence of characters (i.e., a C-string) representing the current value of the string object.
This array includes the same sequence of characters that make up the value of the string object plus an additional terminating null-character ('\0') at the end.
NewValue = (int)NewValueS.c_str(); casts the address to an int, and that address changes every time you execute your code.
You're looking for something like std::stoi (string to int), called like this:
std::string str = ...;
int value = std::stoi(str);
The pros and cons of this option compared to others such as std::stringstream are discussed here:
How to parse a string to an int in C++?
The reason your cast produces a "random" number is that c_str returns a pointer to the start of an array of char.
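Casting a const char* to an int does not parse the text. It converts the pointer value itself, and the result is implementation-defined (the two types may not even have the same size, in which case a compiler may reject the cast), so you will most likely see a memory address that depends on where your std::string is allocated.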
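Since std::stoi throws on bad input, a call usually wants some error handling around it. A minimal sketch, assuming a hypothetical helper parse_int_or that returns a caller-supplied fallback when the text is not a valid int:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper for illustration only.
// std::stoi throws std::invalid_argument if no digits can be parsed and
// std::out_of_range if the value does not fit in an int.
int parse_int_or(const std::string& text, int fallback) {
    try {
        return std::stoi(text);
    } catch (const std::invalid_argument&) {
        return fallback;
    } catch (const std::out_of_range&) {
        return fallback;
    }
}
```

In the question's function this would replace the cast, for example NewValue = parse_int_or(NewValueS, 0);. Note that std::stoi skips leading whitespace and stops at the first character that is not part of the number.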

TCHAR pointer initialization

I am trying to understand the following code:
const TCHAR * portName = "COM15";
I understand that a TCHAR is either a char (in ANSI builds) or a wchar_t (in Unicode builds), basically a 1-byte or 2-byte container that represents a character.
Now, if I declare a pointer to a const TCHAR called portName, portName is then a pointer. When I use the "=" sign, I am giving that pointer a value, and it seems irrational to me that "COM15" would be the address. I assume that line of code is giving me a pointer to the location of the beginning of the "COM15" string of characters, correct?
So what is actually happening in that line of code?
Is a string of characters ("COM15") being created and the "=" sign actually means that the location of the beginning of that string is being given to portName?
"Is a string of characters ("COM15") being created and the "=" sign actually means that the location of the beginning of that string is being given to portName?"
Yes, exactly. But contrary to what your question suggests you might have expected, the string is set up when the program is compiled, not at run time. Also note that the const keyword here prohibits modifying the characters through the pointer; the pointer itself can still be made to point elsewhere later.
This is how C works:
When you say char * str1 in C, you are allocating a pointer in the memory. When you write str1 = "Hello";, you are creating a string literal in memory and making the pointer point to it.
When you create another string literal "new string" and assign it to str1, all you are doing is changing where the pointer points.
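The re-pointing described above can be sketched like this. It uses plain char in place of TCHAR (which is Windows-specific), so it corresponds to an ANSI build, and the function name and the second port name "COM16" are made up for illustration:

```cpp
#include <string>

// Records what the pointer refers to before and after reassignment.
std::string repoint_demo() {
    const char* portName = "COM15";  // holds the address of the literal's first char
    std::string seen = portName;     // copies "COM15"

    // const applies to the characters, not the pointer, so the pointer
    // may be redirected to a different literal.
    portName = "COM16";
    seen += ",";
    seen += portName;
    return seen;
}
```

Neither literal is modified or copied by the assignment; only the address stored in portName changes.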

confusion about char pointer in c++

I'm new in c++ language and I am trying to understand the pointers concept.
I have a basic question regarding the char pointer,
What I know is that the pointer is a variable that stores an address value,
so when I write something like this:
char * ptr = "hello";
From my basic knowledge, I think that after = there should be an address to be assigned to the pointer, but here we assign "hello" which is set of chars.
So what does that mean ?
Does the pointer ptr point to an address that stores "hello"? Or does it store the "hello" itself?
I'm so confused, hope you guys can help me.
Thanks in advance.
ptr holds the address to where the literal "hello" is stored at. In this case, it points to a string literal. It's an immutable array of characters located in static (most commonly read-only) memory.
You can make ptr point to something else by re-assigning it, but modifying the contents it currently points to is illegal. (The literal's type is actually const char[6], which decays to const char*; the conversion to char* was kept for C compatibility, is deprecated, and is ill-formed as of C++11.)
Because of this guarantee, the compiler is free to optimize for space, so
char * ptr = "hello";
char * ptr1 = "hello";
might yield two equal pointers. (i.e. ptr == ptr1)
The pointer is pointing to the address where "hello" is stored. More precisely, it is pointing to the 'h' in "hello".
"hello" is a string literal: a static array of characters. Like all arrays, it can be converted to a pointer to its first element, if it's used in a context that requires a pointer.
However, the array is constant, so assigning it to char* (rather than const char*) is a very bad idea. You'll get undefined behaviour (typically an access violation) if you try to use that pointer to modify the string.
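A short sketch of what "pointing to the first element" means in practice. The helper names are hypothetical, and the pointer is declared const char* so the literal cannot be modified through it:

```cpp
// Dereferencing the pointer yields the first character of the string.
char first_char(const char* ptr) {
    return *ptr;
}

// Indexing is pointer arithmetic plus a dereference.
// The caller must keep n within the array, terminator included.
char nth_char(const char* ptr, int n) {
    return *(ptr + n);
}
```

Called as first_char("hello"), the literal decays to a pointer to its 'h', which is exactly the address that ptr would store.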
The compiler will "find somewhere" that it can put the string "hello", and the ptr will have the address of that "somewhere".
When you create a new char* by assigning it a string literal, what happens is that the char* gets assigned the address of the literal. So the actual value of the char* might be 0x87F2F1A6 (some hex address value). The char* points to the start (in this case the first char) of the string. In C and C++, all C-style strings are terminated with a '\0'; this is how the system knows it has reached the end of the string.
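The role of the terminator can be checked directly. In this sketch (function names are illustrative only), sizeof counts the terminating '\0' while std::strlen stops just before it:

```cpp
#include <cstddef>
#include <cstring>

// Storage taken by the literal: 6 visible characters plus the '\0'.
std::size_t literal_storage_size() {
    return sizeof("Hello!");
}

// Length as seen by C string functions, which scan until the '\0'.
std::size_t literal_length() {
    return std::strlen("Hello!");
}

// The last element of the array is the null character.
bool ends_with_null() {
    const char text[] = "Hello!";
    return text[sizeof(text) - 1] == '\0';
}
```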
char* text = "Hello!" can be thought of as the following:
At program start, you create an array of chars, 7 in length:
{'H','e','l','l','o','!','\0'}. The last one is the null character and shows that there aren't any more characters after it. [It's more efficient than keeping a count associated with the string... A count would take up perhaps 4 bytes for a 32-bit integer, while the null character is just a single byte, or two bytes if you're using Unicode strings. Plus it's less confusing to have a single array ending in the null character than to have to manage an array of characters and a counting variable at the same time.]
The difference between creating an array and making a string constant is that an array is editable and a string constant (or 'string literal') is not. Trying to set a value in a string literal causes problems: they are read-only.
Then, whenever you call the statement char* text = "Hello!", you take the address of that initial array and stick it into the variable text. Note that if you have something like this...
char* text1 = "Hello!";
char* text2 = "Hello!";
char* text3 = "Hello!";
...then it's quite possible that you're creating three separate arrays of {'H','e','l','l','o','!','\0'}, so it would be more efficient to do this...
char* _text = "Hello!";
char* text1 = _text;
char* text2 = _text;
char* text3 = _text;
Most compilers are smart enough to only initialize one string constant automatically, but some will only do that if you manually turn on certain optimization features.
Another note: do not use delete [] on a pointer to a string literal. The literal was never allocated with new [], so deleting it is undefined behavior, even if it appears to cause no issues on a particular system.

need help changing single character in char*

I'm getting back into c++ and have the hang of pointers and whatnot, however, I was hoping I could get some help understanding why this code segment gives a bus error.
char * str1 = "Hello World";
*str1 = '5';
ERROR: Bus error :(
And more generally, I am wondering how to change the value of a single character in a cstring. Because my understanding is that *str1 = '5' should change the value that str1 points to from 'H' to '5'. So if I were to print out str1 it would read: "5ello World".
In an attempt to understand I wrote this code snippet too, which works as expected;
char test2[] = "Hello World";
char *testpa2 = &test2[0];
*testpa2 = '5';
This gives the desired output. So then what is the difference between testpa2 and str1? Don't they both point to the start of a series of null-terminated characters?
When you say char *str = "Hello World"; you are making a pointer to a literal string which is not changeable. It should be required to assign the literal to a const char* instead, but for historical reasons this was not the case before C++11 (oops).
When you say char str[] = "Hello World"; you are making an array which is initialized to (and sized by) a string known at compile time. This is OK to modify.
Not so simple. :-)
The first one creates a pointer to the given string literal, which is allowed to be placed in read-only memory.
The second one creates an array (on the stack, usually, and thus read-write) that is initialised to the contents of the given string literal.
In the first example you try to modify a string literal, this results in undefined behavior.
As per the language standard in 2.13.4.2
Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation-defined. The effect of attempting to modify a string literal is undefined.
In your second example you used string-literal initialization, defined in 8.5.2.1
A char array (whether plain char, signed char, or unsigned char) can be initialized by a string-literal (optionally enclosed in braces); a wchar_t array can be initialized by a wide string-literal (optionally enclosed in braces); successive characters of the string-literal initialize the members of the array.
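For the more general question of how to change a single character, here are two safe options, sketched with hypothetical function names: write into an array initialized from the literal, or use std::string, which manages its own writable buffer.

```cpp
#include <string>

// Option 1: an array initialized from the literal is a writable copy.
std::string replace_first_in_array() {
    char buf[] = "Hello World";
    char* p = &buf[0];   // points into the array, not into the literal
    *p = '5';
    return std::string(buf);
}

// Option 2: std::string copies the literal's characters on construction.
std::string replace_first_in_string() {
    std::string s = "Hello World";
    s[0] = '5';
    return s;
}
```

Both produce "5ello World"; only writing through a pointer to the literal itself is undefined.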