C++ pass by value segmentation fault

I have a function f() which receives a char* p and copies the contents of a const char* into it.
void f(char *p) {
string s = "def";
strcpy(p, s.c_str());
}
In the main() below I expect to get q = "def".
int main(){
char *q = "abc";
f(q);
cout << q << endl;
}
Running this gives a segmentation fault, and as I am new to C++ I don't understand why.
I also get a segmentation fault when I do not initialize q thus:
int main(){
char *q;
f(q);
cout << q << endl;
}
Given that the function's parameter and the way it is called must not change, is there any workaround I can use inside the function? Any suggestions?
Thanks for your help.

You are trying to change a string literal. Any attempt to change a string literal results in undefined behaviour.
Take into account that string literals have the type of constant character arrays. So it would be more correct to write
const char *q = "abc";
From the C++ Standard (2.14.5 String literals)
8 Ordinary string literals and UTF-8 string literals are also referred
to as narrow string literals. A narrow string literal has type
“array of n const char”, where n is the size of the string as
defined below, and has static storage duration
You could write your program the following way
//...
void f(char *p) {
string s = "def";
strcpy(p, s.c_str());
}
//..
int main(){
char q[] = "abc";
f(q);
cout << q << endl;
}
If you need to use a pointer then you could write
//...
void f(char *p) {
string s = "def";
strcpy(p, s.c_str());
}
//..
int main(){
char *q = new char[4] { 'a', 'b', 'c', '\0' };
f(q);
cout << q << endl;
delete []q;
}

This is an issue that really should fail at compile time, but for old legacy reasons compilers still accept it.
"abc" is not a mutable string, and therefore it should be illegal to point a mutable pointer at it.
Really any legacy code that does this should be forced to be fixed, or have some pragma around it that lets it compile or some permissive flag set in the build.
But a long time ago in the old days of C there was no such thing as a const modifier, and literals were stored in char * parameters and programmers had to be careful what they did with them.
The latter construct, where q is not initialised at all, is an error because q could be pointing anywhere, and is unlikely to be pointing to a valid memory location to write the string into. It is actually undefined behaviour, for an obvious reason: who knows where q is pointing?
The normal construct for such a function like f is to request not only a pointer to a writable buffer but also a maximum available size (capacity). Usually this size includes the null-terminator, sometimes it might not, but either way the function f can then write into it without an issue. It will also often return a status, possibly the number of bytes it wanted to write. This is very common for a "C" interface. (And C interfaces are often used even in C++ for better portability / compatibility with other languages).
To make it work in this instance, you need at least 4 bytes in your buffer.
int main()
{
char q[4];
f(q);
std::cout << q << std::endl;
}
would work.
Inside the function f you can use std::string::copy to copy into the buffer. Let's modify f.
(We assume this is a prototype and in reality you have a meaningful name and it returns something more meaningful that you retrieve off somewhere).
size_t f( char * buf, size_t capacity )
{
std::string s = "def";
size_t copied = s.copy( buf, capacity-1 ); // leave a space for the null
buf[copied] = '\0'; // std::string::copy doesn't write this
return s.size() + 1; // the number of bytes you need
}
int main()
{
char q[3];
size_t needed = f( q, 3 );
std::cout << q << " - needed " << needed << " bytes " << std::endl;
}
Output should be:
de - needed 4 bytes
In your question you suggested you can change your function but not the way it is called. Well in that case, you actually have only one real solution:
void f( const char * & p )
{
p = "def";
}
Now you can happily do
int main()
{
const char * q;
f( q );
std::cout << q << std::endl;
}
Note that in this solution I am actually moving your pointer to point to something else. This works because the string literal has static storage duration. You cannot create a local std::string inside f and point p at its c_str(), because that string is destroyed when f returns. You can use a std::string whose lifetime extends beyond the scope of your q pointer, e.g. one stored in a collection somewhere.

Look at the warnings you get while compiling your code (and if you don’t get any, turn up the warning levels or get a better compiler).
You will notice that despite the type declaration, the value of q is not really mutable. The compiler was humoring you because not doing so would break a lot of legacy code.

You can't do that because you assigned a string literal to your char*. And this is memory you can't modify.

With your f, you should do
int main(){
char q[4 /*or more*/];
f(q);
std::cout << q << std::endl;
}

The problem is that you are trying to write to a read-only place in the process address space, since string literals are typically placed in read-only data. char *q = "abc"; creates a pointer that points into the read-only section where the string literal is stored. When you copy using strcpy or memcpy, or even try q[1] = 'x', the program attempts to write to memory that is write-protected.
That was the problem. One of many possible solutions is:
int main(){
char *q = "abc"; // string literals are placed in a read-only portion
// of the process address space, so you cannot write there
q = new char[4]; // makes q point at heap space, which is writable
f(q);
cout << q << endl;
delete[] q;
}
The initialization of q is unnecessary here. After the second line, q points to space for 4 characters on the heap (3 for the characters and one for the null character). This would work, but there are many other and better solutions to this problem, which vary from situation to situation.

Related

Variable always empty

I'm having troubles initializing a global variable. My C++ is a bit rusty so I can't remember the reason why my code isn't working.
file.cpp
const char * write_path = &(std::string(getenv("FIFO_PATH")) + "/pythonread_fifo")[0];
int main(int argc, char**argv)
{
std::cout << "printing a string: ";
std::cout << (std::string(getenv("FIFO_PATH")) + "/pythonread_fifo\n");
std::cout << "printing a const char*: ";
std::cout << &(std::string(getenv("FIFO_PATH")) + "/pythonread_fifo")[0] << std::endl;
std::cout << "printing write_path:";
std::cout << write_path;
std::cout << write_path << std::endl;
std::cout << "printing FIFO_PATH:" << std::string(getenv("FIFO_PATH"));
}
As a premise: FIFO_PATH has been correctly added to bashrc, and it works, however, when I launch this program this is the output:
printing a string: /home/Processes/FIFOs/pythonread_fifo
printing a const char*: /home/Processes/FIFOs/pythonread_fifo
printing write_path:
printing FIFO_PATH:/home/Processes/FIFOs
As you can see write_path is completely empty.
What's even stranger to me is that if I define write_path as:
const char * write_path = "/home/Processes/FIFOs/pythonread_fifo";
Then write_path is no longer empty, it's correctly initialized and printed.
How can I solve this? Or at the very least, why is this happening?
EDIT: The issue IS NOT related to write_path being global. I placed the definition inside the main and when I try to print write_path, it's still empty
write_path is initialized as a pointer to the first element of a temporary std::string, which is destroyed at the end of the full expression. That leaves write_path dangling, and dereferencing it leads to UB.
You can use std::string directly, or use a named std::string, then get the pointer from it.
std::string s = std::string(getenv("FIFO_PATH")) + "/pythonread_fifo";
const char * write_path = &s[0]; // or s.c_str()
On the other hand,
const char * write_path = "/home/mverrocc/dss_cp/dss-cp/Processes/FIFOs/pythonread_fifo";
works fine: the C-style string literal has static storage duration and exists in memory for the life of the program, so write_path is always a valid pointer.
const char * write_path = &(std::string(getenv("FIFO_PATH")) + "/pythonread_fifo")[0];
constructs a temporary std::string, takes the address of its first character, then discards the string, thus deleting the underlying char array. This is UB.
Better just use std::string and use c_str() when you need a const char*
You're creating a temporary std::string object, and get a pointer to its first character. This pointer will become invalid as soon as the expression &(std::string(getenv("FIFO_PATH")) + "/pythonread_fifo")[0] ends, when the temporary object is destructed.
Use a std::string object for write_path as well, define it inside the main function, and use the c_str function of the string when you need a null-terminated string.

Can you safely get a pointer to a string from its c_str() const char*?

I have a const char pointer which I know for sure came from a string. For example:
std::string myString = "Hello World!";
const char* myCstring = myString.c_str();
In my case I know myCstring came from a string, but I no longer have access to that string (I received the const char* from a function call, and I cannot modify the function's argument list).
Given that I know myCstring points to contents of an existing string, is there any way to safely access the pointer of the parent string from which it originated? For example, could I do something like this?
std::string* hackyStringPointer = myCstring - 6; //Along with whatever pointer casting stuff may be needed
My concern is that perhaps the string's contents possibly cannot be guaranteed to be stored in contiguous memory on some or all platforms, etc.
Given that I know myCstring points to contents of an existing string, is there any way to safely access the pointer of the parent string from which it originated?
No, there is no way to obtain a valid std::string* pointer from a const char* pointer to character data that belongs to a std::string.
I received the const char* from a function call, and I cannot modify the function's argument list
Your only option in this situation would be if you can pass a pointer to the std::string itself as the actual const char* pointer, but that will only work if whatever is calling your function does not interpret the const char* in any way (and certainly not as a null-terminated C string), eg:
void doSomething(void (*func)(const char*), const char *data)
{
...
func(data);
...
}
void myFunc(const char *myCstring)
{
const std::string* hackyStringPointer = reinterpret_cast<const std::string*>(myCstring); // reinterpret_cast cannot cast away const
...
}
...
std::string myString = "Hello World!";
doSomething(&myFunc, reinterpret_cast<char*>(&myString));
You cannot convert a const char* that you get from std::string::c_str() to a std::string*. The reason you can't do this is because c_str() returns a pointer to the string data, not the string object itself.
If you are trying to get a std::string so you can use its member functions, then what you can do is wrap myCstring in a std::string_view. This is a non-copying wrapper that lets you treat a C string as if it were a std::string. To do that you would need something like
std::string_view sv{myCstring, std::strlen(myCstring)};
// use sv here like it was a std::string
Yes (it seems), although I agree that if I need to do this it's likely a sign that my code needs reworking in general. Nevertheless, on my platform the string object appears to reside 4 bytes before the const char* which c_str() returns, and I did recover a string* from a const char* belonging to a string.
#include <string>
#include <iostream>
std::string myString = "Hello World!";
const char* myCstring = myString.c_str();
unsigned int strPtrSize = sizeof(std::string*);
unsigned int cStrPtrSize = sizeof(const char*);
long strAddress = reinterpret_cast<std::size_t>(&myString);
long cStrAddress = reinterpret_cast<std::size_t>(myCstring);
long addressDifference = strAddress - cStrAddress;
long estStrAddress = cStrAddress + addressDifference;
std::string* hackyStringPointer = reinterpret_cast<std::string*>(estStrAddress);
cout << "Size of String* " << strPtrSize << ", Size of const char*: " << cStrPtrSize << "\n";
cout << "String Address: " << strAddress << ", C String Address: " << cStrAddress << "\n";
cout << "Address Difference: " << addressDifference << "\n";
cout << "Estimated String Address " << estStrAddress << "\n";
cout << "Hacky String: " << *hackyStringPointer << "\n";
//If any of these asserts trigger on any platform, I may need to re-evaluate my answer
assert(addressDifference == -4);
assert(strPtrSize == cStrPtrSize);
assert(hackyStringPointer == &myString);
The output of this is as follows:
Size of String* 4, Size of const char*: 4
String Address: 15725656, C String Address: 15725660
Address Difference: -4
Estimated String Address: 15725656
Hacky String: Hello World!
It seems to work so far. If someone can show that the address difference between a string and its c_str() can change over time on the same platform, or if all members of a string are not guaranteed to reside in contiguous memory, I'll change my answer to "No."
This reference says
The pointer returned may be invalidated by further calls to other member functions that modify the object.
You say you got the char* from a function call. This means you do not know what happens to the string in the meantime, is that right? If you know that the original string is not changed or deleted (e.g. goes out of scope and is thus destructed), then you can still use the char*.
Your example code however has multiple problems. You want to do this:
std::string* hackyStringPointer = myCstring - 6;
but I think you meant
char* hackyStringPointer = myCstring;
First, you cannot cast the char* to a string*, and second, you do not want to go BEFORE the start of the char*. The char* points to the first character of the string, and you can use it to access the characters up to the trailing 0 character. But you should not go before the first character or after the trailing 0 character, as you do not know what is in that memory or whether it even exists.

Why printing the array of strings does print first characters only?

Please explain the difference in the output of two programs.
cout << branch[i] in first program gives output as:
Architecture
Electrical
Computer
Civil
cout << *branch[i] in second program gives output as:
A
E
C
C
Why?
What is the logic behind *branch[i] giving only first character of each word as output and branch[i] giving full string as an output?
Program 1
#include <iostream>
using namespace std;
int main()
{
const char *branch[4] = { "Architecture", "Electrical", "Computer", "Civil" };
for (int i=0; i < 4; i++)
cout << branch[i] << endl;
system("pause");
return 0;
}
Program 2
#include <iostream>
using namespace std;
int main()
{
const char *branch[4] = { "Architecture", "Electrical", "Computer", "Civil" };
for (int i=0; i < 4; i++)
cout << *branch[i] << endl;
system("pause");
return 0;
}
When you declare a const char* and initialize it with a string literal, for example:
const char* some_string = "some text inside";
What actually happens is that the text is stored in special, read-only memory with a null terminating char ('\0') added after it. The same happens when declaring an array of const char*s. Every single const char* in your array points to the first character of its text in memory.
To understand what happens next, you need to understand how std::cout << works with const char*s. Since const char* is a pointer, it can point to only one thing at a time: the beginning of your text. What std::cout << does with it is print every character, starting with the one the pointer points to, until the null terminating character is encountered. Thus, if you declare:
const char* s = "text";
std::cout << s;
Your computer will allocate read-only memory and assign bytes to hold "text\0" and make your s point to the very first character (being 't').
So far so good, but why does calling std::cout << *s output only a single character? That is because you dereference the pointer, getting what it points to - a single character.
I encourage you to read about pointer semantics and dereferencing a pointer. You'll then understand this very easily.
If, by any chance, you cannot connect what you have just read here to your example:
With const char* branch[4]; you declare an array of const char*s. The expression branch[0] is equivalent to *(branch + 0), which dereferences a pointer into your array and yields a single const char*. Then *branch[0] is understood as *(*(branch + 0)), which dereferences that const char* and yields a single character.
branch[i] contains a char* pointer, which is pointing to the first char of a null-terminated string.
*branch[i] is using operator* to dereference that pointer to access that first char.
operator<< is overloaded to accept both char and char* inputs. In the first overload, it prints a single character. In the second overload, it outputs characters in consecutive memory until it reaches a null character.
This is because of operator precedence.
Subscript operator [] has a higher precedence than an indirection operator *.
So branch[i] returns const char * and *branch[i] returns const char.
*branch[i] prints a single char located at the address pointed to by branch[i].
branch[i] prints the whole null-terminated string starting at the address pointed to by branch[i].

Initialization of pointers in c++

I need to clarify my concepts regarding the basics of pointer initialization in C++. As per my understanding, a pointer must be assigned an address before putting some value using the pointer.
int *p;
*p=10; //inappropriate
cout << *p <<"\n";
This would probably show the correct output (10), but it may cause issues in larger programs, since p initially holds a garbage address that could be anything and may be in use somewhere else in the program. So I believe this is incorrect, and the correct way is:
int *p;
int x=10;
p=&x; //appropriate
cout << *p <<"\n";
My question is, if the above understanding is correct, then does the same apply on char* as well?:
const char *str="hello"; // inappropriate
cout << str << "\n";
//OR
const string str1= "hello";
const char str2[6] ="world";
const char *str=str1; //appropriate
const char *st=str2; //appropriate
cout << str << st << "\n";
Please advise.
Your understanding of strings is incorrect.
Let's take for example the very first line:
const char *str="hello";
This is actually correct. A string literal like "hello" is turned into a constant array by the compiler, and like all arrays it can decay to a pointer to its first element. So what you are doing is making str point to the first character of the array.
Then let's continue with
const string str1= "hello";
const char *str=str1;
This is actually wrong. A std::string object has no conversion operator to const char *. The compiler will give you an error for this. You need to use the c_str function to get a pointer to the contained string.
Lastly:
const char str2[6] ="world";
const char *st=str2; //appropriate
This is really no different than the first line when you declare and initialize str. This is, as you say, "appropriate".
About that first example with the "inappropriate" pointer:
int *p;
*p=10; //inappropriate
cout << *p <<"\n";
This is not only "inappropriate", this leads to undefined behavior and may actually crash your program. Also, the correct term is that the value of p is indeterminate.
When I declare a pointer
int *p;
I get an object p whose values are addresses. No ints are created anywhere. The thing you need to do is think of p as being an address rather than being an int.
At this point, this isn't particularly useful since you have no addresses you could assign to it other than nullptr. Well, technically that's not true: p itself has an address which you can get with &p and store it in an int**, or even do something horrible like p = reinterpret_cast<int*>(&p);, but let's ignore that.
To do something with ints, you need to create one. e.g. if you go on to declare
int x;
you now have an int object whose values are integers, and we could then assign its address to p with p = &x;, and then recover the object from p via *p.
Now, C style strings have weird semantics — the weirdest aspect being that C doesn't actually have strings at all: it's always working with arrays of char.
String literals, like "Hello!", are guaranteed to (act[1] like they) exist as an array of const char located at some address, and by C's odd conversion rules, this array automatically converts to a pointer to its first element. Thus,
const char *str = "hello";
stores the address of the h character in that character array. The declaration
const char str2[6] ="world";
works differently; this (acts[1] like it) creates a brand new array, and copies the contents of the string literal "world" into the new array.
As an aside, there is an obsolete and deprecated feature here for compatibility with legacy programs, but for some misguided reason people still use it in new programs these days so you should be aware of it and that it's 'wrong': you're allowed to break the type system and actually write
char *str = "hello";
This shouldn't work because "hello" is an array of const char, but the standard permits this specific usage. You're still not actually allowed to modify the contents of the array, however.
1: By the "as if" rule, the program only has to behave as if things happen as I describe, but if you peeked at the assembly code, the actual way things happen can be very different.

Pointer of a character in C++

Going by the books, the first cout line should print the address of the location where the char variable b is stored, as happens for the int variable a. But the first cout statement prints an odd 'dh^#', while the second statement correctly prints a hex value '0x23fd68'. Why is this happening?
#include<iostream>
using namespace std;
int main()
{
char b='d';
int a=10;
char *c=new char[10];
c=&b;
int *e=&a;
cout<<"c: "<<c<<endl;
cout<<"e: "<<e;
}
There is a non-member overload of operator<< for std::basic_ostream taking a const char*, which doesn't write the address but rather the (presumed) C-style string (see note 1). In your case, since you have assigned the address of a single character, there is no NUL terminator, and thus no valid C-style string. The code exhibits undefined behavior.
The behavior for int* is different, as there is no special handling for pointers to int, and the statement writes the address to the stream, as expected.
If you want to get the address of the character instead, use a static_cast:
std::cout << static_cast<void*>( c ) << std::endl;
1) A C-style string is a sequence of characters, terminated by a NUL character ('\0').
Actually this program has another problem: there is a memory leak.
char *c=new char[10];
c=&b;
This allocates 10 characters on heap, but then the pointer to heap is overwritten with the address of the variable b.
When a char* is written to cout with operator<<, it is treated as a null-terminated C string. Since c holds the address of b, a single character containing 'd', operator<< keeps reading along the stack until it finds the first null character. It seems that one was found after a few characters, so dh^# is written (the d is the value of variable b; the rest is just whatever bytes happened to be on the stack before the first \0 char).
If you want to get the address try to use static_cast<void*>(c).
My example:
int main() {
char *c;
char b = 'd';
c = &b;
cout << c << ", " << static_cast<void*>(c) << endl;
}
And the output:
dÌÿÿ, 0xffffcc07
See the strange characters after 'd'.
I hope this could help a bit!