C++, strings, and pointers - c++

I know this is rudimentary but I know nothing about C++. Is it necessary to do:
string *str = getNextRequest();
rather than
string str = getNextRequest();
in order to reference str later on in the same block of code? If so, what type of error would the latter produce?

That depends entirely on the return type of getNextRequest.
Strings can be used and reused throughout the scope they're declared in. They essentially contain a mutable C string and some handling information, helper methods, etc.
You can, very safely, return a string from a function and the compiler will make a copy or move it as necessary. That string (str here) can then be used normally, with no worries about using out-of-scope locals or such.
There are times when a pointer to a string is needed, but unless you're using it as an out parameter, those tend to be rare and indicate some design oddity.
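A minimal sketch of returning a string by value, assuming a hypothetical getNextRequest() that builds its result locally (the asker's real function is not shown, so the body here is only an illustration):

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the asker's getNextRequest().
std::string getNextRequest() {
    std::string request = "GET /index.html";  // local object
    return request;                           // moved or copy-elided into the caller's variable
}

// The caller owns an independent string that stays valid for its whole scope.
std::size_t requestLength() {
    std::string str = getNextRequest();
    str += " HTTP/1.1";                        // modifying str is fine; it is the caller's own object
    assert(str == "GET /index.html HTTP/1.1");
    return str.size();
}
```

Calling requestLength() from main exercises both functions; no pointer is needed to keep using str later in the same block.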

Which you use depends on what getNextRequest() returns. If it returns a string *, use the first line; if it returns a string, use the second.
So if the declaration of getNextRequest is like this:
string getNextRequest();
Then
string str = getNextRequest();
is correct. If the declaration is like this:
string *getNextRequest();
Then you can go with
string *str = getNextRequest();
or
string str = *getNextRequest();

string str = getNextRequest();
will create a copy of the string returned by getNextRequest. If you want changes to str to also appear in the string that getNextRequest returned, getNextRequest has to return a pointer or reference.
If this is what you want, then you should define getNextRequest as:
string& getNextRequest()
and use it like:
string& str = getNextRequest();
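The difference between copying the result and binding a reference to it could be sketched like this. The file-scope variable pendingRequest is an assumption made for the example; the important part is that a function returning string& must refer to an object that outlives the call, never to a local:

```cpp
#include <cassert>
#include <string>

// Hypothetical storage that outlives every call.
static std::string pendingRequest = "ping";

std::string& getNextRequest() { return pendingRequest; }

void referenceVersusCopy() {
    std::string copy = getNextRequest();   // independent copy
    copy += "!";
    assert(pendingRequest == "ping");      // original untouched

    std::string& ref = getNextRequest();   // alias for pendingRequest
    ref += "?";
    assert(pendingRequest == "ping?");     // original changed through the reference
}
```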

string str* = getNextRequest();
As noted by @dasblinkenlight, that would be a syntax error.
But to answer your original question, is it necessary? No. In general, you should not use pointers unless you must.
Especially with the STL. The STL is not designed to be used with pointers--it does dynamic memory management for you. Unless you have a good reason, you should always use vector<int> v and string s rather than vector<int>* or string*.

You will probably need to provide a little bit more information regarding this function getNextRequest(). Where is it from? Library? API? Purpose?
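If the return type of the function is a string* (pointer to a string), then the string usually lives somewhere that outlasts the call, most often on the "heap" (although the pointer could equally refer to a static or member object). In that case it does not matter which block of code you reference the string from. As long as you keep the pointer and the object is still alive, you will be able to access it.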
If the return type of the function is simply a string (meaning not a pointer), it will return the value, not the address of str. In essence, you would be "copying" the string to your new variable. In this case, the variable would be allocated on the stack, and you would only be able to reference it when in the scope of the code block.

Related

"Use & to create a pointer to a member" when using c_str

I'm trying to add a char* to a vector of char*'s by casting it over from a string. Here's the code I'm using:
vector<char*> actionLog;
// lots of code
int value = ...
// lots of code
string str = "string";
cout << str << value << endl;
str += std::to_string(player->scrap);
actionLog.push_back(str.c_str());
The problem is that I get the specified "Use & to create a pointer to a member" error for the push_back line. str.c_str should return a char*, which is the type that actionLog uses. I'm either incorrect about how c_str works, or doing something else wrong. Pushing to actionLog with
actionLog.push_back("something");
Works fine, but not what I mentioned. What am I doing wrong?
EDIT: I was actually using c_str() as a function, I just copied it incorrectly
There are actually several problems with what you are trying to do.
Firstly, c_str is a member function. You would have to call it, using (): str.c_str().
c_str() returns a const char*, so you won't be able to store it in a vector<char*>. This is so you can't break the std::string by changing its internals in ways it doesn't expect.
You really should not store the result of c_str(). It only remains valid until you do some non-const operation on the std::string it came from. I.e. if you make a change to the content of the std::string, then try to use the corresponding element in the vector, you have Undefined Behaviour! And from the way you have laid out your example, it looks like the lifetime of the string will be much shorter than the lifetime of the vector, so the vector would point to something that doesn't even exist any more.
Maybe it's better to just use a std::vector<std::string>. If you don't need the original string again after this, you could even std::move it into the vector, and avoid extra copying.
As an aside, please reconsider your use of what are often considered bad practices: using namespace std; and endl. The latter is a bit contentious, but at least understand why and make an informed decision.
std::basic_string::c_str() is a member function, not a data member - you need to invoke it by using ().
The correct code is:
actionLog.push_back(str.c_str());
Note that std::basic_string::c_str() returns a pointer to const char - your actionLog vectors should be of type std::vector<const char*>.
Vittorio's answer tells you what you did wrong in the details. But I would argue that what you're doing wrong is really using a vector<char*> instead of a vector<string> in the first place.
With the vector of pointers, you have to worry about lifetime of the underlying strings, about them getting invalidated, changed from under you, and so on. The name actionLog suggests that the thing is long-lived, and the code you use to add to it suggests that str is a local helper variable used to build the log string, and nothing else. The moment str goes out of scope, the vector contains a dangling pointer.
Change the vector to a vector<string>, do an actionLog.push_back(str), and don't worry about lifetimes or invalidation.
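A rough sketch of the vector<string> approach, assuming the log is a global as in the question. The function names logScrap and lastEntry and the parameter scrap (standing in for player->scrap) are made up for the example:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

std::vector<std::string> actionLog;  // owns its strings, so nothing can dangle

// `scrap` stands in for player->scrap from the question.
void logScrap(int scrap) {
    std::string str = "scrap: ";
    str += std::to_string(scrap);
    actionLog.push_back(std::move(str));  // str is not needed afterwards, so move it
    assert(!actionLog.empty());
}

// A const char* can still be obtained when a C API needs one. It stays valid
// only until that element is modified or destroyed, or the vector reallocates.
const char* lastEntry() {
    return actionLog.back().c_str();
}
```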
You forgot (), c_str is a method, not a data member. Just write actionLog.push_back(str.c_str());

Pointers in C++ - A beginner's dilemma

Pointers have always made me go blank on the logic I intend to use in code. If someone can help me understand a few concepts, that would be really helpful. Here's a code snippet from my program:
vector <char> st;
char *formatForHtml(string str, string htmlTag)
{
string strBegin;
strBegin = "<";
strBegin.append(htmlTag);
strBegin.append(">");
strBegin.append(str);
string strEnd = "</";
strEnd.append(htmlTag);
strEnd.append(">");
strBegin.append(strEnd);
st.resize(strBegin.size());
for (int i =0;i <strBegin.size();i++) {
st[i] = strBegin.at(i);
}
return &st[0];
}
In the code above if I have to return address of st[0], I have to write the function of type char *. I need to know the reason to do so, also if address is the integer value why can I not define function as an int type?
P.S. It's a beginner level doubt.
You don't tell us what st is, so we can't tell whether the code is totally incorrect, or just bad design. If st is a typo for str (just guessing, since str isn't used anywhere), then you have undefined behavior, and the program is incorrect, since you're returning a pointer into an object which has been destructed. At any rate, a function like formatForHtml should return an std::string, as a value, and not a pointer (and certainly not a pointer to a non-const).
I might add that you don't use a loop to copy string values character by character. std::string acts as a normal value type, so you can just assign: st = strBegin;.
EDIT:
Just for the record, I've since reexamined your code. The natural way of writing it would be:
std::string
formatForHtml( std::string const& cdata, std::string const& tag )
{
return '<' + tag + '>' + cdata + "</" + tag + '>';
}
No pointers (at least none visible; in fact, "</" is converted to a pointer when passed to the + operator), and full use of std::string's facilities.
I have to write the function of type 'char *' I need to know the reason to do so
There's no reason to do so. Just return a std::string.
From the looks of your code, st is a std::vector<char>. Since std::vector has contiguous memory, &st[0] is the address of the first char in the vector, and thus points to the character sequence that the vector represents. That sequence is not null-terminated, so the pointer isn't very helpful outside the function because you can't find the length.
That's awkward code, you're expecting 'str' as an argument, but then you return its memory location? Why not return the string itself?
Your code there has a few typos. (Did you mean 'str' when you typed 'st'? If not, then where is 'st' defined?)
As for why you can't define the function as "int" type, that is because an "int" is a different type to "pointer to char".
Personally, I would return 'string', and just have a 'return str' at the end there.
A pointer is a pointer, not an integer value. The code is correct if and only if st is a global variable or declared static in this function. But returning a string is preferable to returning a pointer.
Why don't you use string as return type for your function?
First: a pointer is an address of something lying in memory, so you can't replace it with an integer (only with some dirty tricks).
Second: &st[0] is the address of the internal buffer the string uses to store its content. When st goes out of scope (or is reused for the next call), it will be overwritten or given back to the heap manager, so the one calling this function will end up with some garbage.
Btw, most of your function does unnecessary work which the string class can do by itself (for example copying strBegin to st).
Simple.
string str
is something that contains a bunch of characters.
str[0]
returns the first character.
Therefore,
&str[0]
gives you the address of the first character, therefore it's a "char *" --- "char" being whatever the pointer points to, "*" being the "address of" part of the pointer. The type of the pointer is about the object being pointed to.
Also, addresses may be just "integers" on the hardware level (everything is an integer... except for integers themselves, they are a bunch of bools), but that doesn't mean anything for a higher level language. For a higher level language, a pointer is a pointer, not an integer.
Treating one as another is almost always an error, and C++11 even added the nullptr keyword so we don't have to rely on the one valid "pointer is an integer" case.
And btw, your code won't work. "str" only exists for the duration of the function, so when the function returns "str" ceases to exist, which means the pointer you have returned points into a big black nowhere.
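To make the "a pointer is not an integer" point concrete, here is a small sketch. The function names are invented for the example, and std::uintptr_t is used on the assumption that the platform provides it (it is formally optional, though common implementations have it):

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// The address of a char has type char*; the type records what is pointed to.
char firstCharViaPointer(std::string& s) {
    char* p = &s[0];
    // int bad = &s[0];   // error: no implicit conversion from char* to int
    assert(p == &s[0]);
    return *p;
}

// Viewing an address as a number takes an explicit cast to an integer type
// that is wide enough to hold it.
std::uintptr_t addressAsNumber(const char* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}
```

The commented-out line is the one the question asks about: a function declared to return int cannot return &st[0] without such a cast, and the caller could not safely use the result as a string anyway.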

C++ const cast, unsure if this is secure

It may seem a silly question, but I really need to clarify this:
Will this bring any danger to my program?
Is the const_cast even needed?
If I change the input pointer's values in place, will it work safely with std::string or will it create undefined behaviour?
So far my only concern is that modifying through the input pointer could affect the string some_text and make it unusable.
std::string some_text = "Text with some input";
char * input = const_cast<char*>(some_text.c_str());
Thanks for giving me some hints; I would like to avoid shooting myself in the foot.
As an example of evil behavior: the interaction with gcc's Copy On Write implementation.
#include <string>
#include <iostream>
int main() {
std::string const original = "Hello, World!";
std::string copy = original;
char* c = const_cast<char*>(copy.c_str());
c[0] = 'J';
std::cout << original << "\n";
}
In action at ideone.
Jello, World!
The issue? As the name implies, gcc's implementation of std::string (in libstdc++ before the GCC 5 ABI change) uses a ref-counted shared buffer under the covers. When a string is modified, the implementation will neatly check if the buffer is shared at the moment, and if it is, copy it before modifying it, ensuring that other strings sharing this buffer are not affected by the new write (thus the name, copy on write).
Now, with your evil program, you access the shared buffer via a const-method (promising not to modify anything), but you do modify it!
Note that with MSVC's implementation, which does not use Copy On Write, the behavior would be different ("Hello, World!" would be correctly printed).
This is exactly the essence of Undefined Behavior.
Modifying an inherently const object by casting away its constness with const_cast is Undefined Behavior.
string::c_str() returns a const char *, i.e. a pointer to a constant C-style string. Technically, modifying this will result in Undefined Behavior.
Note that the intended use of const_cast is when you have a const pointer to non-const data and you wish to modify that data.
Simply casting will not bring forth undefined behavior. Modifying the data pointed at, however, will. (Also see ISO 14882:98 5.2.11-7).
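As a sketch of the one case described above where writing through a const_cast result is well defined, consider a pointer-to-const that actually refers to a non-const array. The names setFirst and legitimateConstCast are invented for the example:

```cpp
#include <cassert>

// Writes through a pointer-to-const. This is only valid when the array that
// `text` points into was not declared const in the first place.
void setFirst(const char* text, char c) {
    const_cast<char*>(text)[0] = c;
}

void legitimateConstCast() {
    char buffer[] = "hello";     // non-const object
    const char* view = buffer;   // const view of non-const data
    setFirst(view, 'j');         // well defined: buffer itself is writable
    assert(buffer[0] == 'j');
}
```

The buffer returned by std::string::c_str() does not fall into this category, because the standard forbids the program from modifying it.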
If you want a pointer to modifiable data, you can have a
std::vector<char> wtf(str.begin(), str.end());
char* lol= &wtf[0];
The std::string manages its own memory internally, which is why c_str() can return a pointer directly to that memory. The pointer is to const so that your compiler will complain if you try to modify it.
Using const_cast in that way literally casts away such safety and is only an arguably acceptable practice if you are absolutely sure that memory will not be modified.
If you can't guarantee this then you must copy the string and use the copy; it's certainly a lot safer to do this in any event (you can use strcpy).
See the C++ reference website:
const char* c_str ( ) const;
"Generates a null-terminated sequence of characters (c-string) with the same content as the string object and returns it as a pointer to an array of characters.
A terminating null character is automatically appended.
The returned array points to an internal location with the required storage space for this sequence of characters plus its terminating null-character, but the values in this array should not be modified in the program and are only guaranteed to remain unchanged until the next call to a non-constant member function of the string object."
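Copying before modifying, as recommended above, might look like the following sketch. It uses a std::vector<char> rather than strcpy so that the buffer frees itself; mutableCopy and modifyCopyOnly are names invented for the example:

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Copies the characters plus the terminating '\0' into a buffer the caller may modify.
std::vector<char> mutableCopy(const std::string& text) {
    const char* begin = text.c_str();
    return std::vector<char>(begin, begin + text.size() + 1);
}

void modifyCopyOnly() {
    std::string some_text = "Text with some input";
    std::vector<char> buffer = mutableCopy(some_text);
    char* input = buffer.data();                  // writable, null-terminated
    input[0] = 't';
    assert(some_text == "Text with some input");  // original untouched
    assert(std::strcmp(input, "text with some input") == 0);
}
```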
Yes, it will bring danger, because
input points to whatever c_str happens to be right now, but if some_text ever changes or goes away, you'll be left with a pointer that points to garbage. The value of c_str is guaranteed to be valid only as long as the string doesn't change. And even, formally, only if you don't call c_str() on other strings too.
Why do you need to cast away the const? You're not planning on writing to *input, are you? That is a no-no!
This is a very bad thing to do. Check out what std::string::c_str() does and agree with me.
Second, consider why you want a non-const access to the internals of the std::string. Apparently you want to modify the contents, because otherwise you would use a const char pointer. Also you are concerned that you don't want to change the original string. Why not write
std::string input( some_text );
Then you have a std::string that you can mess with without affecting the original, and you have std::string functionality instead of having to work with a raw C++ pointer...
Another spin on this is that it makes code extremely difficult to maintain. Case in point: a few years ago I had to refactor some code containing long functions. The author had written the function signatures to accept const parameters but then was const_casting them within the function to remove the constness. This broke the implied guarantee given by the function and made it very difficult to know whether the parameter has changed or not within the rest of the body of the code.
In short, if you have control over the string and you think you'll need to change it, make it non-const in the first place. If you don't then you'll have to take a copy and work with that.
It is UB.
For example, you can do something like this:
size_t const size = (sizeof(int) == 4 ? 1024 : 2048);
int arr[size];
without any cast and the compiler will not report an error. But this code is illegal.
The moral is that you need to consider each action every time.

C++ - char* vs. string*

If I have a pointer that points to a string variable array of chars, is there a difference between typing:
char *name = "name";
And,
string name = "name";
Yes, there’s a difference. Mainly because you can modify your string but you cannot modify your first version – but the C++ compiler won’t even warn you that this is forbidden if you try.
So always use the second version.
If you need to use a char pointer for whatever reason, make it const:
char const* str = "name";
Now, if you try to modify the contents of str, the compiler will forbid this (correctly). You should also push the warning level of your compiler up a notch: then it will warn that your first code (i.e. char* str = "name") is legal but deprecated.
For starters, you probably want to change
string *name = "name";
to read
string name = "name";
The first version won't compile, because a string* and a char* are fundamentally different types.
The difference between a string and a char* is that the char* is just a pointer to the sequence. This approach of manipulating strings is based on the C programming language and is the native way in which strings are encoded in C++. C strings are a bit tricky to work with - you need to be sure to allocate space for them properly, to avoid walking off the end of the buffer they occupy, to put them in mutable memory to avoid segmentation faults, etc. The main functions for manipulating them are in <cstring>. Most C++ programmers advise against the use of C-style strings, as they are inherently harder to work with, but they are still supported both for backwards compatibility and as a "lowest common denominator" on which low-level APIs can build.
A C++-style string is an object encapsulating a string. The details of its memory management are not visible to the user (though you can be guaranteed that all the memory is contiguous). It uses operator overloading to make some common operations like concatenation easier to use, and also supports several member functions designed to do high-level operations like searching, replacing, substrings, etc. They also are designed to interoperate with the STL algorithms, though C-style strings can do this as well.
In short, as a C++ programmer you are probably better off using the string type. It's safer and a bit easier to use. It's still good to know about C-style strings because you will certainly encounter them in your programming career, but it's probably best not to use them in your programs where string can also be used unless there's a compelling reason to do so.
Yes, the second one isn't valid C++! (It won't compile).
You can create a string in many ways, but one way is as follows:
string name = "name";
Note that there's no need for the *, as we don't need to declare it as a pointer.
char* name = "name" should be invalid but compiles on most systems for backward compatibility to the old days when there was no const and that it would break large amounts of legacy code if it did not compile. It usually gets a warning though.
The danger is that you get a pointer to writable data (writable according to the rules of C++) but if you actually tried writing to it you would invoke Undefined Behaviour, and the language rules should attempt to protect you from that as much as is reasonably possible.
The correct construct is
const char * name = "name";
There is nothing wrong with the above, even in C++. Using string is not always more correct.
Your second statement should really be
std::string name = "name";
string is a class (actually a typedef of basic_string<char, char_traits<char>, allocator<char>>) defined in the standard library and therefore in namespace std (as are basic_string, char_traits and allocator).
There are various scenarios where using string is far preferable to using arrays of char. In your immediate case, for example, you CAN modify it. So
name[0] = 'N';
(convert the first letter to upper-case) is valid with string and not with the char* (undefined behaviour) or const char * (won't compile). You would be allowed to modify the string if you had char name[] = "name";
However, if you want to append a character to the string, the std::string construct is the only one that will allow you to do that cleanly. With the old C API you would have to use strcat() but that would not be valid unless you had allocated enough memory to do that.
std::string manages the memory for you so you do not have to call malloc() etc. Actually allocator, the 3rd template parameter, manages the memory underneath - basic_string makes the requests for how much memory it needs but is decoupled from the actual memory allocation technique used, so you can use memory pools, etc. for efficiency even with std::string.
In addition basic_string does not actually perform many of the string operations which are done instead through char_traits. (This allows it to use specialist C-functions underneath which are well optimised).
std::string therefore is the best way to manage your strings when you are handling dynamic strings constructed and passed around at run-time (rather than just literals).
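The modification and append cases described above can be sketched side by side. The function name compareStringKinds is invented for the example:

```cpp
#include <cassert>
#include <cstring>
#include <string>

void compareStringKinds() {
    std::string name = "name";
    name[0] = 'N';               // fine: std::string owns writable storage
    name += "sake";              // grows as needed, no manual allocation
    assert(name == "Namesake");

    char array[] = "name";       // writable array of 5 chars, including '\0'
    array[0] = 'N';
    assert(std::strcmp(array, "Name") == 0);
    assert(sizeof(array) == 5);
    // std::strcat(array, "sake");  // would overflow the 5-byte array: undefined behaviour

    const char* literal = "name";
    // literal[0] = 'N';            // does not compile: the pointee is const
    assert(literal[0] == 'n');
}
```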
You will rarely use a string* (a pointer to a string). If you do so it would be a pointer to an object, like any other pointer. You would not be able to allocate it the way you did.
The C++ string class encapsulates a C-style char string and is much more convenient (http://www.cplusplus.com/reference/string/string/).
For legacy code you can always "extract" a char pointer from a string variable and work with it as a char pointer:
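char * cstr;
string str ("Please split this phrase into tokens");
cstr = new char [str.size()+1]; // +1 for the terminating '\0'
strcpy (cstr, str.c_str()); // str.c_str() returns a null-terminated const char*
// Since C++11 str.data() is equivalent; before that it was not guaranteed to be null-terminated.
// Release the buffer with delete[] cstr when it is no longer needed.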
Yes, char* is a pointer to an array of characters, which is a string. string* is a pointer to a std::string, or to the first element of an array of std::string (which is very rarely used).
string *name = "name";
"name" is a string literal that decays to a const char*, and it will never be converted to a std::string*. This results in a compile error.
The valid declaration:
string name = "name";
or
const char* name = "name"; // char* name = "name" is valid, but deprecated
string *name = "name";
Does not compile in GCC.

std::string vs string literal for functions

I was wondering, I normally use std::string for my code, but when you are passing a string in a parameter for a simply comparison, is it better to just use a literal?
Consider this function:
bool Message::hasTag(string tag)
{
for(Uint tagIndex = 0; tagIndex < m_tags.size();tagIndex++)
{
if(m_tags[tagIndex] == tag)
return true; // found a matching tag
}
return false; // no tag matched
}
Despite the fact that the property it is making a comparison with is a vector, and whatever uses this function will probably pass strings to it, would it still be better to use a const char* to avoid creating a new string that will be used like a string literal anyway?
If you want to use classes, the best approach here is a const reference:
bool Message::hasTag(const string& tag);
That way, redundant copying can be minimized and it's made clear that the method doesn't intend to modify the argument. I think a clever compiler can emit pretty good code for the case when this is called with a string literal.
Passing a character pointer requires you to use strcmp() to compare, since if you start comparing pointers directly using ==, there will be ... trouble.
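The "trouble" with == on character pointers can be shown with a short sketch. The names sameText and pointerVersusContent are invented for the example:

```cpp
#include <cassert>
#include <cstring>

// Compares contents, not addresses.
bool sameText(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}

void pointerVersusContent() {
    char first[]  = "tag";
    char second[] = "tag";      // same characters, different array
    const char* a = first;
    const char* b = second;
    assert(a != b);             // == on pointers compares addresses
    assert(sameText(a, b));     // strcmp compares the characters
}
```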
Short answer: it depends.
Long answer: std::string is highly useful because it provides a lot of utility functions for strings (searching for substrings, extracting substrings, concatenating strings etc.). It also manages the memory for you, so the ownership of the string cannot be confused.
In your case, you don't need either. You just need to know whether any of the objects in m_tags matches the given string. So for your case, writing the function using a const char *s is perfectly sufficient.
However, as a footnote: you almost always want to prefer std::string over (const) char * when talking about return values. That's because C strings have no ownership semantics at all, so a function returning a const char * needs to be documented very carefully, explaining who owns the pointed-to memory (caller or callee) and, in case the caller gets it, how to free it (delete[], delete, free, something else).
I think it would be enough to pass a reference rather than the value of the string. I mean:
bool Message::hasTag(const string& tag)
That would copy only the reference to the original string value, which must be created somewhere anyway, but outside of the function. This function would not copy its parameter at all.
Since m_tags is a vector of strings anyway (I suppose), const string& parameter would be better idea.
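Putting the const-reference suggestion together, a minimal sketch could look like this. The Message class here is a hypothetical reduction (the asker's real class is not shown), addTag and checkTags are invented for the example, and std::find replaces the hand-written loop:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical minimal Message; only what the example needs.
class Message {
public:
    void addTag(const std::string& tag) { m_tags.push_back(tag); }

    // Takes the tag by const reference: no copy when the caller already has a std::string.
    bool hasTag(const std::string& tag) const {
        return std::find(m_tags.begin(), m_tags.end(), tag) != m_tags.end();
    }

private:
    std::vector<std::string> m_tags;
};

void checkTags() {
    Message message;
    message.addTag("urgent");
    std::string wanted = "urgent";
    assert(message.hasTag(wanted));     // binds directly, no copy
    assert(!message.hasTag("spam"));    // a literal still builds one temporary std::string
}
```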