Storing const char* gives me random characters - c++

Okay so basically i have a struct like this
struct person{
const char* name;
const char* about_me;
const char* mom_name;
const char* age;
};
And then in order to make my code versatile i have
struct Person PersonAsArray[MAX_ARRAY - 1];
And then I have a file that reads in a bunch of stuff and eventually I parse it. But when I parse it I get a std::string, so I have to convert it to a const char*. Here's some more of my code:
getline(file, line);
//break the line up into 2 parts (because in the file its "name=John")
//these two parts are called id and value
if(id == "name"){
const char* CCvalue = value.c_str();
cout << CCvalue << endl; // its fine here
PersonAsArray[i].name = CCvalue; //i is incremented each time i need a new struct
}
if(id == "age"){
PersonAsArray[i].age = atoi(value.c_str());
}
//and some more of this stuff... eventually i have
cout << PersonAsArray[0].name << endl;
cout << PersonAsArray[0].about_me << endl;
cout << PersonAsArray[0].mom_name << endl;
cout << PersonAsArray[0].age << endl;
But when I finally cout everything, I end up with something that looks like this. I'm just a little curious about what's going on and why it's giving me symbols, and it's not always the same symbols. Sometimes I get the smiley face, sometimes I don't even get the whole row of rectangles. I have no idea what I'm doing and it's probably some major flaw in my coding. But this also happens when I do something like this
string hi = "hello";
for(i = 0; hi[i] != '\0'; i++){
char x = hi[i];
string done = "";
if(x == 'h') done += "abc";
if(x == 'e') done += "zxc";
if(x == 'l') done += "aer";
if(x == 'o') done += "hjg";
cout << done;
}
I think I remember getting these flower-like shapes, and I think I even saw Chinese characters, but again they were not consistent. Even if I didn't change anything in the program, if I ran it several times I would see several different combinations of symbols, and sometimes no symbols would appear.

You did not read the documentation!
The value returned by std::string::c_str() does not live forever.
The pointer obtained from c_str() may be invalidated by:
Passing a non-const reference to the string to any standard library function, or
Calling non-const member functions on the string, excluding operator[], at(), front(), back(), begin(), rbegin(), end() and rend().
The destructor is one such "non-const member function".
Once the pointer is invalidated, you cannot use it. When you try, you either get the data stored at some arbitrary place in memory (your computer's futile attempts to make sense of that data, as if it were text, are resulting in the flowers and Chinese characters you describe) or other unpredictable, bizarre symptoms.
Unfortunately you did not present a complete, minimal testcase so we have no idea how value really fits into your code, but it's clear that it does not survive intact between your "its fine here" and your problematic code.
Don't store the result of std::string::c_str() long-term. There's no need to, and it's rarely useful to.
tl;dr Make person store std::strings, not dangling pointers.

The problem is that you have something like
{
std::string value;
// fill value
PersonsAsArray[i].name = value.c_str();
}
Now, value is a local variable which gets destroyed upon exiting the scope in which it is declared. You store the pointer to its internal data to a .name but you are not copying it so after destruction it points to garbage.
You should have a std::string name field instead of that const char*. It handles copying and retaining the content by itself through its copy assignment operator. Alternatively, allocate memory for the const char* manually, for example through strdup.
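Both answers come down to letting each field own its text. A minimal sketch of that, assuming the file holds one "key=value" pair per line as the question describes; the Person layout, the int age field and the helper names are illustrative, not the asker's actual code:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Each field owns its characters, so nothing dangles when the
// parser's temporaries go out of scope.
struct Person {
    std::string name;
    std::string about_me;
    std::string mom_name;
    int age = 0;
};

// Reads "key=value" lines from `in` and fills one Person.
// Lines without '=' and unknown keys are skipped.
// Note: std::stoi throws if the age value is not a number.
Person parse_person(std::istream& in) {
    Person p;
    std::string line;
    while (std::getline(in, line)) {
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) continue;
        const std::string id = line.substr(0, eq);
        const std::string value = line.substr(eq + 1);
        if (id == "name") p.name = value;              // copies the characters
        else if (id == "about_me") p.about_me = value;
        else if (id == "mom_name") p.mom_name = value;
        else if (id == "age") p.age = std::stoi(value);
    }
    return p;
}

// Convenience wrapper for text that is already in memory.
Person parse_person_from_text(const std::string& text) {
    std::istringstream in(text);
    return parse_person(in);
}
```

Because value is copied into the struct, it no longer matters that value itself is destroyed at the end of each loop iteration.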

Related

"ReadProcessMemory" how to get std::string?

Programm_A
int main()
{
std::cout << "Process ID = " << GetCurrentProcessId() << "\n"; // Id my process (i get something like '37567')
std::string My_String = "JoJo"; // My string
std::cout << &My_String << std::endl; //here i get something like '0x0037ab7'
system("pause");
}
This program just prints the address of the string "JoJo" to the console.
Programm_B
int main()
{
int id;
std::cin >> id;
DWORD ProcessId = id;
HANDLE ProcessHandle = OpenProcess(PROCESS_VM_READ, FALSE, ProcessId);
if (!ProcessHandle) {
std::cout << "Process is not found...\n";
return 0;
}
std::string r;
std::cin >> r; // this is me writing the link that I get in programs_A
DWORD address = std::strtoul(r.c_str(), NULL, 16);
std::string JoJo_string = " ";
ReadProcessMemory(ProcessHandle, (LPVOID)(address), &JoJo_string, sizeof(JoJo_string), 0); //here I want to get the JoJo_string value by reference from programm_A
std::cout << JoJo_string << std::endl;
}
The funny thing is that everything works fine with the "int" variable type. But std::string is not working. The exact value reads, but the program immediately gives an error:
[-- programm_B --]
[-- error --]
You can't easily read a std::string across process boundaries.
Different standard library implementations of std::string use different memory layouts for its data members. The std::string in your program may be using a different implementation than the program you are trying to read from.
But even if the two programs used the exact same implementation, it still wouldn't matter, because std::string is simply too complex to read with a single ReadProcessMemory() call. std::string uses dynamic memory for its character data, so one of its data members is a char* pointer to the data stored elsewhere in memory. Which is complicated by the fact that std::string might also implement a local Short-String Optimization buffer so short string values are stored directly in the std::string object itself to avoid dynamic memory allocations.
So, you would have to know the exact implementation of std::string being used by the target program in order to decipher its data members and discover where the character data is actually being stored. In the case where SSO is not active, you would then read the char* pointer, as well as the number of characters being pointed at (which itself is also determined in an implementation-specific way), so that you could read the character data with another ReadProcessMemory() call.
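The point about character data living outside the object can be checked inside a single process, without any Windows API. This is only a sketch: stored_inline is a hypothetical helper, and whether a short string is stored inline depends on the standard library in use.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// True if the character data lives inside the std::string object itself
// (small-string optimization), false if it lives in a separate allocation.
bool stored_inline(const std::string& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    const char* data = s.data();
    // std::less gives a total order even for unrelated pointers.
    std::less<const char*> before;
    return !before(data, obj_begin) && before(data, obj_end);
}
```

For a string much longer than sizeof(std::string), this returns false: copying sizeof(std::string) bytes, as the ReadProcessMemory() call in the question does, would copy only a pointer and some bookkeeping, and that pointer means nothing in the reading process.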
In short, what you are attempting to do is a futile effort.
You most certainly CAN get the contents of a std::string from another process. After all, debuggers do that!
You need the results of that string's data() and length() functions, and then you read the memory they describe. The important point is to halt execution of the second process while you are doing that; otherwise that memory location can become invalid between your calls.

Why is my string variable not printing a full string?

Program:
#include <iostream>
using namespace std;
int main() {
string str;
str[0] = 'a';
str[1] = 'b';
str[2] = 'c';
cout << str;
return 0;
}
Output:
No output.
If I replace cout << str; with cout << str[1], I get a proper output.
Output:
b
And if I change the data type of the variable to a character array, I get the desired output. (Replacing string str; with char str[5];)
Output:
abc
Why is my program behaving like this? How do I alter my code to get the desired output without changing the data type?
Your program has undefined behavior.
string str;
creates an empty string. It has length 0.
You are trying to write to the first three elements of this string with
str[0] = 'a';
str[1] = 'b';
str[2] = 'c';
These do not exist. Indexing a std::string out-of-bounds causes undefined behavior.
You can add characters to a string with any of the following methods:
str += 'a';
str += "a";
str.push_back('a');
str.append("a");
or you can resize the string first to the intended length before you index into any of its elements:
str.resize(3);
As pointed out by @Ayxan in a comment under this answer, you are also missing #include <string>. Without it, it is unspecified whether your program will compile, since it uses std::string, which is defined in <string>. It is unspecified whether including one standard library header will include another one unless there is a specific exception. You should not rely on unspecified behavior that may break at any point.
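Put together, the two fixes from this answer might look like the sketch below. The function names are made up for illustration; both build the string "abc" without changing the data type.

```cpp
#include <string>

// Grow the string one character at a time; each call extends the size.
std::string build_by_appending() {
    std::string str;
    str += 'a';
    str.push_back('b');
    str.append("c");
    return str;
}

// Give the string its final length first, then index into it.
std::string build_by_resizing() {
    std::string str;
    str.resize(3);  // indices 0..2 now exist, each holding '\0'
    str[0] = 'a';
    str[1] = 'b';
    str[2] = 'c';
    return str;
}
```

Either version leaves size() at 3, so cout << str prints all three characters.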
In addition to walnut's answer, here's what's going on under the hood:
A std::string contains at least two members, a data pointer (type char*) and a size. The [] operator does not check the size; it only indexes into the memory behind the data pointer. Also, the [] operator does not modify the size member of the string.
The << operator that streams the string to cout, however, does interpret the size member. It does so to be able to output strings containing null characters. Since the size member is still zero, nothing is printed.
You may wonder why the memory access within str[0] even succeeds, after all, the string never had any reason to allocate any memory for its data yet. This is due to the fact that virtually all std::string implementations use the small-string-optimization: The std::string object itself is a bit larger than it needs to be, and the space at its end is used instead of an allocation on the heap unless the string becomes longer than that space. As such, the default constructor will just point the data pointer to that internal storage to have it initialized, and your memory accesses are directed to existing memory. This is why you don't get a SEGFAULT unless you access the string's data way out of bounds. Doesn't change the fact that already your expression str[0] is undefined behavior, though. The symptoms may appear benign, but the disease is fatal.
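Two of these claims can be observed without relying on undefined behavior. The sketch below uses at(), the bounds-checked counterpart of operator[], and an std::ostringstream in place of cout; the helper names are hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

// Tries to write c at position pos using the bounds-checked at().
// Returns true if the write was rejected because pos is out of range.
bool write_rejected(std::string& s, std::size_t pos, char c) {
    try {
        s.at(pos) = c;
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Number of characters that operator<< emits for s. It follows
// s.size(), not the position of the first '\0'.
std::size_t streamed_length(const std::string& s) {
    std::ostringstream out;
    out << s;
    return out.str().size();
}
```

On an empty string, write_rejected(str, 0, 'a') reports the error that str[0] = 'a' silently hides, and a string constructed as std::string("a\0b", 3) streams all three characters.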

String Reversal Memory Consumption Differences

Suppose I implement the following two string reversal algorithms:
void reverse(string &s) {
if(s.size() == 1) return;
string restOfS = s.substr(1);
reverse(restOfS);
s = restOfS + s.at(0);
}
string reverseString(string s) {
if(s.size() == 1) return s;
return reverseString(s.substr(1)) + s.at(0);
}
int main() {
string name = "Dominic Farolino";
reverse(name);
cout << name << endl;
name = reverseString(name);
cout << name << endl;
return 0;
}
One of these obviously modifies the string given to it, and one returns a new string. Since the first one modifies the given string and uses a reference parameter as its mode of communication to the next recursive stack frame, I at first assumed this would be more efficient, since using a reference parameter may help us not duplicate things in memory down the line. However, I don't believe that's the case. Obviously we have to use a reference parameter with this void function, but it seems that we are undoing any memory efficiency a reference parameter may give us, since we are just declaring a new variable on the stack every time.
In short, it seems that the first one is making a copy of the reference every call, and the second one is making a copy of the value each call and just returning its result, making them of equal memory consumption.
To make the first one more memory efficient I feel like you'd have to do something like this:
void reverse(string &s) {
if(s.size() == 1) return;
reverse(s.substr(1));
s = s.substr(1) + s.at(0);
}
however the compiler won't let me:
error: invalid initialization of non-const reference of type 'std::string& {aka std::basic_string<char>&}' from an rvalue of type 'std::basic_string<char>'
6:6: note: in passing argument 1 of 'void reverse(std::string&)'
Is this analysis correct?
substr() returns a new string every time, complete with all the memory use that goes with that. So if you're going to do N-1 calls to substr(), that's O(N^2) extra memory you're using for no reason.
With std::string though, you can modify it in place, just by iterating over it with a simple for loop. Or just using std::reverse:
void reverseString(string &s) {
std::reverse(s.begin(), s.end());
}
Either way (for loop or algorithm) takes O(1) extra memory instead - it effectively is just a series of swaps, so you just need one extra char as the temporary. Much better.
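The "simple for loop" variant mentioned above could be sketched like this; reverse_in_place is an illustrative name, and std::reverse remains the shorter choice in real code.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Reverses s in place by swapping characters from both ends
// towards the middle. No substrings are created.
void reverse_in_place(std::string& s) {
    if (s.empty()) return;  // avoids size() - 1 wrapping around
    for (std::size_t i = 0, j = s.size() - 1; i < j; ++i, --j) {
        std::swap(s[i], s[j]);
    }
}
```

Unlike the recursive versions in the question, this also handles the empty string, which their s.size() == 1 base case never reaches.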

Modifying the length and contents of the string?

To change the contents of a string in a function such that it reflects in the main function we need to accept the string as reference as indicated below.
Changing contents of a std::string with a function
But in the above code we are changing the size of string also(i.e, more than what it can hold), so why is the program not crashing ?
Program to convert decimal to binary, mind it, the code is not complete and I am just testing the 1st part of the code.
void dectobin(string & bin, int n)
{
int i=0;
while(n!=0)
{
bin[i++]= (n % 2) + '0';
n = n / 2;
}
cout << i << endl;
cout << bin.size() << endl;
cout << bin << endl;
}
int main()
{
string s = "1";
dectobin(s,55);
cout << s << endl;
return 0;
}
O/p: 6 1 1 and the program crashes in codeblocks. While the above code in the link works perfectly fine.
It only outputs the correct result, when i initialize the string in main with 6 characters(i.e, length of the number after it converts from decimal to binary).
http://www.cplusplus.com/reference/string/string/capacity/
Notice that this capacity does not suppose a limit on the length of the string. When this capacity is exhausted and more is needed, it is automatically expanded by the object (reallocating it storage space). The theoretical limit on the length of a string is given by member max_size
If the string resizes itself automatically then why do we need the resize function and then why is my decimal to binary code not working?
Your premise is wrong. You are thinking 1) if I access a string out of bound then my program will crash, 2) my program doesn't crash therefore I can't be accessing a string out of bounds, 3) therefore my apparently out of bounds string accesses must actually resize the string.
1) is incorrect. Accessing a string out of bounds results in undefined behaviour. This means exactly what it says. Your program might crash but it might not; its behaviour is undefined.
And it's a fact that accessing a string never changes its size; that's why we have the resize function (and push_back etc.).
We must get questions like yours several times a week. Undefined behaviour is clearly a concept that newbies find surprising.
Check this link about std::string:
char& operator[] (size_t pos);
const char& operator[] (size_t pos) const;
If pos is not greater than the string length, the function never
throws exceptions (no-throw guarantee). Otherwise, it causes
undefined behavior.
In your while loop you are accessing the bin string at indices past its end: bin.size() is 1, but i runs from 0 to 5.
You aren't changing the size of the string anywhere. If the string you pass into the function is of length one and you access it at indices larger than 0, i.e., at bin[1], bin[2], you are not modifying the string but some other memory locations after the string's characters, and there might be something else stored there. Corrupting memory in this way does not necessarily lead directly to a crash or an exception. It may do so later, once your program uses those memory locations again.
Accepting a reference to a string makes it possible to change instances of strings from the calling code inside the called code:
void access(std::string & str) {
// str is the same instance as the function
// is called with.
// without the reference, a copy would be made,
// then there would be two distinct instances
}
// ...
std::string input = "test";
access(input);
// ...
So any function or operator that is called on a reference is effectively called on the referenced instance.
When, similar to your linked question, the code
str = " new contents";
is inside of the body of the access function, then operator= of the input instance is called.
This assignment operator replaces the previous contents of the string with a copy of the characters of its argument, growing the string's storage first if the new contents need more room.
On the other hand, when you have code like
str[1] = 'a';
inside the access function, then this calls operator[] on the input instance. This operator is only providing access to the underlying storage of the string, and not doing any resizing.
So your issues aren't related to the reference, but to misusing the index operator[]:
Calling that operator with an argument greater than the string's size/length leads to undefined behaviour, as does writing anything other than '\0' at the index equal to the size.
To fix that, you could resize the string manually before using the index operator.
As a side note: IMO you should try to write your function in a more functional way:
std::string toOct(std::string const &);
That is, instead of modifying the passed string, create a new one.
The bounds of the string are limited by its current content. That is why when you initialise the string with 6 characters you will stay inside bounds for conversion of 55 to binary and program runs without error.
The automatic expansion feature of strings can be utilised using
std::string::operator+=
to append characters at the end of current string. Changed code snippet will look like this:
void dectobin(string & bin, int n){
//...
bin += (n % 2) + '0';
//...
}
Plus you don't need to initialise the original string in main() and your program should now run for arbitrary decimals as well.
int main(){
//...
string s;
dectobin(s,55);
//...
}
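A complete version of the conversion, as a sketch only: to_binary is a hypothetical name, it returns a new string instead of filling a reference parameter, and it adds a final reversal because the loop produces the least significant bit first.

```cpp
#include <algorithm>
#include <string>

// Converts n to its binary representation, e.g. 55 -> "110111".
std::string to_binary(unsigned int n) {
    if (n == 0) return "0";
    std::string bits;
    while (n != 0) {
        // Appending grows the string; no index ever goes out of bounds.
        bits += static_cast<char>('0' + n % 2);
        n /= 2;
    }
    // Digits were produced lowest bit first, so flip them.
    std::reverse(bits.begin(), bits.end());
    return bits;
}
```

The static_cast makes the int-to-char conversion explicit; bin += (n % 2) + '0' compiles too, but some compilers warn about the implicit narrowing.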

Comparing received string on server side - C++

I followed this tutorial (http://codebase.eu/tutorial/linux-socket-programming-c/) and made a server. The thing is that when the server receives a string from the client, I don't know how to compare it. For example, the following doesn't work:
bytes_received = recv(new_sd, incomming_data_buffer, 1000, 0);
if(bytes_received == 0)
cout << "host shut down." << endl;
if(bytes_received == -1)
cout << "receive error!" << endl;
incomming_data_buffer[bytes_received] = '\0';
cout << "Received data: " << incomming_data_buffer << endl;
//The comparison in the if below doesn't work. The if isn't entered
//if the client sent "Hi", which should work
if(incomming_data_buffer == "Hi\n")
{
cout << "It said Hi!" << endl;
}
You are attempting to compare a character pointer with a string literal (which will resolve to a character pointer), so yeah, the code you have definitely won't work (nor should it). Since you're in C++, I would suggest this:
if(std::string(incomming_data_buffer) == "Hi\n")
cout<<"It said Hi!"<<endl;
Now, you need to include <string> for this to work, but I assume you are already doing this, especially if you are comparing strings using this method in other places in your code.
Just an explanation of what's going on here, since you appear to be relatively new to C++. In C, a string literal is an array of characters that is used through a char* pointer, and mutable strings are simply character arrays. If you've ever programmed in C, you might remember that (char* == char*) doesn't actually compare strings; it compares addresses, and you would need the strcmp() function to compare the text.
C++, however, introduces the std::string type, which can be directly compared using the '==' operator (and concatenated using the '+' operator). But, C code still runs in C++, so char* arrays are not necessarily promoted to std::string unless they are being operated on by a std::string operator (and even then, if I recall, they aren't so much promoted as the operator allows string/char* comparisons), so (std::string == char*) will perform the expected comparison operation. When we do std::string(char*), we call the std::string constructor, which returns a string (in this case, a temporary one) that is compared with your string literal.
Note that I am assuming incomming_data_buffer is of type char*, you are using it like it is, although I can't see the actual declaration.
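Both comparison styles can be wrapped up as below. This is a sketch under the same assumption that the buffer is a char array; received_equals and c_strings_equal are made-up names, and bytes_received stands for the value recv() returned.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Compares exactly the bytes that were received with `expected`.
// A count of 0 (peer closed) or -1 (error) never matches, and the
// buffer is not touched in those cases.
bool received_equals(const char* buffer, long bytes_received,
                     const std::string& expected) {
    if (buffer == nullptr || bytes_received <= 0) return false;
    return std::string(buffer, static_cast<std::size_t>(bytes_received)) == expected;
}

// The C way: compare the text of two null-terminated strings.
bool c_strings_equal(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}
```

Checking the count first also avoids the write to incomming_data_buffer[-1] that the question's code performs when recv() fails. Depending on the client, the message may arrive as "Hi", "Hi\n" or "Hi\r\n", so the expected text has to match what is actually sent.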