Pointers in C++ - A beginner's dilemma

Pointers in C++ - A beginner's dilemma - c++

Pointers have always made me blank about the logic I intend to use in code, If someone can help me understand a few concepts that would be really helpful. Here's a code snippet from my program,
vector <char> st;
char *formatForHtml(string str, string htmlTag)
{
string strBegin;
strBegin = "<";
strBegin.append(htmlTag);
strBegin.append(">");
strBegin.append(str);
string strEnd = "</";
strEnd.append(htmlTag);
strEnd.append(">");
strBegin.append(strEnd);
st.resize(strBegin.size());
for (int i =0;i <strBegin.size();i++) {
st[i] = strBegin.at(i);
}
return &st[0];
}
In the code above if I have to return address of st[0], I have to write the function of type char *. I need to know the reason to do so, also if address is the integer value why can I not define function as an int type?
P.S. It's a beginner level doubt.

You don't tell us what st is, so we can't tell whether the code is
totally incorrect, or just bad design. If st is a typo for str
(just guessing, since str isn't used anywhere), then you have
undefined behavior, and the program is incorrect, since you're returning
a pointer into an object which has been destructed. At any rate, a
function like formatForHtml should return an std::string, as a
value, and not a pointer (and certainly not a pointer to a non-const).
I might add that you don't use a loop to copy string values character by
character. std::string acts as a normal value type, so you can just
assign: st = strBegin;.
EDIT:
Just for the record, I've since reexamined your code. The natural way
of writing it would be:
std::string
formatForHtml( std::string const& cdata, std::string const& tag )
{
return '<' + tag + '>' + cdata + "</" + tag + '>';
}
No pointers (at least not visible---in fact, "+ operator), and full use of
std::strings facilities.

I have to write the function of type 'char *' I need to know the reason to do so
There's no reason to do so. Just return a std::string.
From the looks of your code, st is a std::vector<char>. Since std::vector has continuous memory, &st[0] is the address of the first char in the vector, and thus points to the not null-terminated string that the vector represents, which isn't very helpful outside the function because you can't find the length.

That's awkward code, you're expecting 'str' as an argument, but then you return its memory location? Why not return the string itself?
Your code there has a few typos. (Did you mean 'str' when you typed 'st'? If not, then where is 'st' defined?)
As for why you can't define the function as "int" type, that is because an "int" is a different type to "pointer to char".
Personally, I would return 'string', and just have a 'return str' at the end there.

Pointer is pointer, not integer value. Code is right if and only if st is global variable or declared as static in this function. But return string is preferably, than return pointer.

Why don't you use string as return type for your function?
First: a pointer is an adress of something lying in memory, so you can't replace it with an integer (only with some dirty tricks).
Second: &st[0] is the adress of the internal buffer the string uses to store its content. When st goes out of scope (or is reused for the next call), it will be overwritten or given back to the heap manager, so the one calling this function will end with some garbage.
Btw, most of your function does unnessesary work which the string class can do by itself (for example copiing strBegin to st).

Simple.
string str
is something that contains a bunch characters.
str[0]
return the first character.
Therefore,
&str[0]
gives you the address of the first character, therefore it's a "char *" --- "char" being whatever the pointer points to, "*" being the "address of" part of the pointer. The type of the pointer is about the object being pointed to.
Also, addresses may be just "integers" on the hardware level (everything is an integer... except for integers themselves, they are a bunch of bools), but that doesn't mean anything for a higher level language. For a higher level language, a pointer is a pointer, not an integer.
Treating one as another is almost always an error, and C++11 even added the nullptr keyword so we don't have to rely on the one valid "pointer is an integer" case.
And btw, your code won't work. "str" only exists for the duration of the function, so when the function returns "str" ceases to exist, which means the pointer you have returned points into a big black nowhere.

Related

Difference between * and & when passing a string to any function?

This is a simple operator overloading program I learned and I can't understand exactly why the parameterized constructor is using '*st' and not just 'st' or '&st'.
Now, I know the difference between passing by reference, address and value.
In the given example, I passed a string. If it was passing by reference, the argument in the parameterized constructor would have been '&st'. I'm not sure if it is passing the string by address.
Basically, I don't know how the code is working there. Please explain that and also why using '&st' in place of '*st' isn't working.
class tyst
{
int l;
char *p;
public:
tyst(){l=0;p=0;}
tyst(char *st)
{
l=strlen(st);
p = new char[l];
strcpy(p,st);
}
void lay();
tyst operator + (tyst b);
};
void tyst::lay()
{
cout<<p<<endl;
}
tyst tyst::operator+(tyst b)
{
tyst temp;
temp.l=l+b.l;
temp.p = new char[temp.l];
strcpy(temp.p,p);
strcat(temp.p,b.p);
return temp;
}
int main()
{
tyst s1("Markus Pear");
tyst s2("son Notch");
tyst s3;
s3=s1+s2;
s3.lay();
return 0;
}
So, I'd really appreciate if anyone can clear this up for me.

st is a C-style string. Passed by value
tyst(char st)
you would merely get a single character.
Passed by reference
tyst(char & st)
You would also get only a single character, but you could modify it. Not so useful in this case. You could also pass in a reference to a pointer, but I don't see much use to that, either.
How this is working
tyst(char * st)
says that the function will take a pointer, and that pointer may be pointing to a single character, the first of an unknown number of characters, absolutely nothing, or complete garbage. The last two possibilities are why use of references is preferred over pointers, but you can't use a reference for this. You could however use a std::string. In C++, this would often be the preferred approach.
Inside tyst, the assumption that an unknown, but null-terminated, number of characters is pointed at, and this is almost what is being provided. Technically, what you are doing here with
tyst s1("Markus Pear");
is illegal. "Markus Pear" is a string literal. It may be stored in non-writeable memory, so it is a const char *, not a char *.

The constructor is expecting a pointer to a c-string (null terminated character array). Applying pointer arithmetic to it will allow access to the entire string. Note that new[] returns a pointer to the first element; it is (one way) how pointers are used.
Aside from the syntax errors, it makes no sense to pass a single character by reference to the class. It isn't interested in a single character, it is interested in where the beginning of the array is.
It would be like asking somebody what their home address is, and they give you a rock from their lawn.

Declaring arrays: name of the array with spaces

I'm trying to understand a part of my professor's code. He gave an example for a hw assignment but I'm not sure how to understand this part of the code..
Here is the code:
void addTask(TaskOrAssignment tasks[], int& taskCount, char *course, char *description, char *dueDate)
{
tasks[taskCount].course = course;
tasks[taskCount].description = description;
tasks[taskCount].dueDate = dueDate;
taskCount++;
}
Question: Is "tasks[taskCount].course = course;" accessing or declaring a location for char course?
I hope I could get this answered and I'm pretty new to this site too.
Thank you.

tasks[taskCount].course = course;
Let's break this down a piece at a time. First of all, we are using the assignment operator (=) to assign a value of one variable to another variable.
The right hand side is pretty simple, just the variable named course which is declared as a char*.
It is assigned to the variable tasks[taskCount].course. If you look at the parameters of the method, you can see that tasks is declared as an array of TaskOrAssignment objects. So tasks[taskCount] refers to one of the elements of this array. The .course at the end refers to a field named course in that object. Assuming that this code compiles, that field is declared as a char* in the TaskOrAssignment class.
Most likely, both course variables represent a string of characters. (This originates from C.) When all is said and done, both course and tasks[taskCount].course point to the same string buffer.

course is a char pointer, it points to the block of memory in stack or heap storing the '\0' terminated string.
tasks[taskCount].course is also a char pointer, the assignment just lets tasks[taskCount].course point to same memory address that course does.

course is a char* (a pointer to char). Presumably, the course member of TaskOrAssignment is also a char*. All that line does is assign the value of course to the member course of the taskCountth element in the array.
Presumably, the course argument is intended to be a pointer to the first character in a null-terminated C-style string. So after this assignment, the course member of the array element will point at that same C-style string. However, of course, the pointer could really be pointing anywhere.

C++ const cast, unsure if this is secure

It maybe seems to be a silly question but i really need to clarify this:
Will this bring any danger to my program?
Is the const_cast even needed?
If i change the input pointers values in place will it work safely with std::string or will it create undefined behaviour?
So far the only concern is that this could affect the string "some_text" whenever I modify the input pointer and makes it unusable.
std::string some_text = "Text with some input";
char * input = const_cast<char*>(some_text.c_str());
Thanks for giving me some hints, i would like to avoid the shoot in my own foot

As an example of evil behavior: the interaction with gcc's Copy On Write implementation.
#include <string>
#include <iostream>
int main() {
std::string const original = "Hello, World!";
std::string copy = original;
char* c = const_cast<char*>(copy.c_str());
c[0] = 'J';
std::cout << original << "\n";
}
In action at ideone.
Jello, World!
The issue ? As the name implies, gcc's implementation of std::string uses a ref-counted shared buffer under the cover. When a string is modified, the implementation will neatly check if the buffer is shared at the moment, and if it is, copy it before modifying it, ensuring that other strings sharing this buffer are not affected by the new write (thus the name, copy on write).
Now, with your evil program, you access the shared buffer via a const-method (promising not to modify anything), but you do modify it!
Note that with MSVC's implementation, which does not use Copy On Write, the behavior would be different ("Hello, World!" would be correctly printed).
This is exactly the essence of Undefined Behavior.

To modify an inherently const object by casting away its constness using const_cast is an Undefined Behavior.
string::c_str() returns a const char *, i.e: a pointer to a constant c-style string. Technically, modifying this will result in Undefined Behavior.
Note, that the use of const_cast is when you have a const pointer to a non const data and you wish to modify the non-constant data.

Simply casting will not bring forth an undefined behavior. Modifying the data pointed at, however, will. (Also see ISO 14882:98 5.2.7-7).
If you want a pointer to modifiable data, you can have a
std::vector<char> wtf(str.begin(), str.end());
char* lol= &wtf[0];

The std::string manages it's own memory internally, which is why it returns a pointer to that memory directly as it does with the c_str() function. It makes sure it's constant so that your compiler will warn you if you try to do modifiy it.
Using const_cast in that way literally casts away such safety and is only an arguably acceptable practice if you are absolutely sure that memory will not be modified.
If you can't guarantee this then you must copy the string and use the copy.; it's certainly a lot safer to do this in any event (you can use strcpy).

See the C++ reference website:
const char* c_str ( ) const;
"Generates a null-terminated sequence of characters (c-string) with the same content as the string object and returns it as a pointer to an array of characters.
A terminating null character is automatically appended.
The returned array points to an internal location with the required storage space for this sequence of characters plus its terminating null-character, but the values in this array should not be modified in the program and are only guaranteed to remain unchanged until the next call to a non-constant member function of the string object."

Yes, it will bring danger, because
input points to whatever c_str happens to be right now, but if some_text ever changes or goes away, you'll be left with a pointer that points to garbage. The value of c_str is guaranteed to be valid only as long as the string doesn't change. And even, formally, only if you don't call c_str() on other strings too.
Why do you need to cast away the const? You're not planning on writing to *input, are you? That is a no-no!

This is a very bad thing to do. Check out what std::string::c_str() does and agree with me.
Second, consider why you want a non-const access to the internals of the std::string. Apparently you want to modify the contents, because otherwise you would use a const char pointer. Also you are concerned that you don't want to change the original string. Why not write
std::string input( some_text );
Then you have a std::string that you can mess with without affecting the original, and you have std::string functionality instead of having to work with a raw C++ pointer...

Another spin on this is that it makes code extremely difficult to maintain. Case in point: a few years ago I had to refactor some code containing long functions. The author had written the function signatures to accept const parameters but then was const_casting them within the function to remove the constness. This broke the implied guarantee given by the function and made it very difficult to know whether the parameter has changed or not within the rest of the body of the code.
In short, if you have control over the string and you think you'll need to change it, make it non-const in the first place. If you don't then you'll have to take a copy and work with that.

it is UB.
For example, you can do something like this this:
size_t const size = (sizeof(int) == 4 ? 1024 : 2048);
int arr[size];
without any cast and the comiler will not report an error. But this code is illegal.
The morale is that you need consider action each time.

C++, strings, and pointers

I know this is rudimentary but I know nothing about C++. Is it necessary to do:
string *str = getNextRequest();
rather than
string str = getNextRequest();
in order to reference str later on in the same block of code? If so, what type of error would the latter produce?

That depends entirely on the return type of getNextRequest.
Strings can be used and reused throughout the scope they're declared in. They essentially contain a mutable C string and some handling information, helper methods, etc.
You can, very safely, return a string from a function and the compiler will make a copy or move it as necessary. That string (str here) can then be used normally, with no worries about using out-of-scope locals or such.
There are times when a pointer to a string is needed, but unless you're using it as an out parameter, those tend to be rare and indicate some design oddity.

Which you use depends on what getNextRequest() returns. If it returns a string *, then use the first line, if it returns string then use the second.
So if the declaration of getNextRequest is like this:
string getNextRequest();
Then
string str = getNextRequest();
is correct. If the declaration is like this:
string *getNextRequest();
Then you can go with
string *str = getNextRequest();
or
string str = *getNextRequest();

string str = getNextRequest();
will create a copy of the string returned by getNextRequest. If you want to alter the contents of str and wish that these changes are also within the string returned by getNextRequest you have to return a pointer or reference.
If this is what you want, then you should define getNextRequest as:
string& getNextRequest()
and use it like:
string& str = getNextRequest();

string str* = getNextRequest();
As noted by #dasblinkenlight, that would be a syntax error
But to answer your original question, is it necessary? No. In general, you should not use pointers unless you must.
Especially with the STL. The STL is not designed to be used with pointers--it does dynamic memory management for you. Unless you have a good reason, you should always use vector<int> v and string s rather than vector<int>* or string*.

You will probably need to provide a little bit more information regarding this function getNextRequest(). Where is it from? Library? API? Purpose?
If the return type of the function is a string* (pointer to str), then the string has been allocated to the "heap". This means, it does not matter which block of code you reference the string from. As long as you maintain the pointer, you will be able to access it.
If the return type of the function is simply a string (meaning not a pointer), it will return the value, not the address of str. In essence, you would be "copying" the string to your new variable. In this case, the variable would be allocated on the stack, and you would only be able to reference it when in the scope of the code block.

C++ Why isn't call by reference needed for strcpy()

I have a homework assignment with a number of questions. One is asking why the strcpy() function doesn't need the call by reference operator for CStrings. I've looked through the book numerous times and I can't, for the life of me, find the answer. Can anyone help explain this to me?
It is an array of sorts so I would think you would need the call by reference.

strcpy() takes a pointer to char.
Thus you don't pass the "string" as a parameter but only the address of its first character.
So basically you have something like this:
void func(const char* str);
const char* str = "abc";
func(str); // Here str isn't passed by reference but since str is a pointer to the first character ('a' here), you don't need a reference.
Passing a pointer is fast. On a 32 bits architecture, a pointer takes 32 bits, whatever the length of the pointed string.

If you mean class CString, then in other words the question asks you:
Why does this compile?
CString sExample;
char buffer[LARGE_ENOUGH];
strcpy(buffer, sExample);
The answer is, because class CString defines an operator const char* and therefore can be converted to the type of strcpy's second argument.
I 'm not sure if this is what you mean though.

This a problem of terminology, mostly.
An "object" (I use the term as designing "a chunk of RAM") is passed by value when the called function gets a copy of the chunk. It is passed by reference when the called function gets a way to access the one and only chunk.
Consider this:
void f(int x)
{
x = 42;
}
void g()
{
int y = 54;
f(y);
// here "y" still has value 54
}
Here, the function f() modifies x, but that is its own x, a variable which contains a copy of the contents of the y variable of g(). What f() does with its x does not impact what the y of g() contains. The variable is then passed by value.
C++ has a notion of reference which goes like this:
void f(int& x)
{
x = 42;
}
void g()
{
int y = 54;
f(y);
// here "y" now has value 42
}
Here, the special construction (with the "&") instructs the C++ compiler to play some hidden tricks so that the x known by f() is actually a kind of alias on the y variable used by g(). There is only one variable: the x and the y designate the same chunk of RAM. The variable is here passed by reference.
Now, consider a "C string". In C (and C++), a string is just a bunch of char values, the last of which having value 0 (this is the conventional string terminator). The code does not handle strings directly; it uses pointers. A pointer is a value which actually designates an emplacement in RAM. The pointer is not the string; the pointer is a kind of number (usually on 32 or 64 bits, it depends on the processor type and the operating system) which tells where in RAM is the first char of the string. So when you call strcpy() you actually give it pointers (one for the destination buffer, one for the source string). Those pointers are unmodified: the source string and the destination buffers are not moved in the process; only the contents of the string are copied into the destination buffer.
Hence, strcpy() needs not have access to the variables which contain the pointers in the caller code. strcpy() only needs to know where the destination buffer and the source strings are in RAM. Giving a copy of the pointer values to strcpy() is enough. Hence, those pointers are passed by value.
Note that in the presence of pointers, there are two objects to consider: the variable which contains the pointer, and the chunk of RAM which is pointed to. The pointer itself is passed by value (strcpy() receives its own copy of the variable which contains the pointer to the destination buffer). We can say that the pointed-to chunk of RAM (the destination buffer) is "passed by reference" since it is not duplicated in the process and the called function (strcpy()) can modify it. The point here is that the term "reference" has two distinct meanings:
The C++ syntax meaning: "reference" designates the special construction with the "&" that I have described above.
The language theory formal meaning: "reference" designates a way by which a value is indirectly designated, so that caller and callee may access the same chunk of RAM under distinct names. With that meaning, passing by value a pointer to a called function is equivalent to passing by reference the chunk of RAM to which the pointer points.
A C++ "reference" (first meaning) is a syntaxic way to pass "by reference" (second meaning) a variable.

Well in the case you mean c-strings (char*) you don't need a call by reference because the string itself is a pointer. So string copy knows where to/from where to copy the string.

Because strcpy works with char* which are pointers. The pointer is passed by value, and strcpy uses that pointer to access the indiviual characters in the target string and change them. Compare that to passing an integer by value - the function can't change the original integer.
Understanding how char* strings are not like integers is vital to you not going crazy during your C++ course. Well done for your prof making you face it.

Because in C when calling functions, arrays are passed as the address of the first element, which is equivalent of calling by reference.
See Peter Van Der Linden Expert C programming, Deep secrets book.

automatic type conversion, is the answer I guess they're looking for. Looking that turn up might give you some help.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Pointers in C++ - A beginner's dilemma - c++

Pointer is pointer, not integer value. Code is right if and only if st is global variable or declared as static in this function. But return string is preferably, than return pointer.

Related

Difference between * and & when passing a string to any function?

Declaring arrays: name of the array with spaces

C++ const cast, unsure if this is secure

C++, strings, and pointers

C++ Why isn't call by reference needed for strcpy()

Categories

Resources