How do you unindex a struct? - c++

How do I unindex a struct? Example:
typedef struct String_s {
int current_location;
int size;
char data[0];
} String;
char* String_getCString(String *str){
return &str->data[0];
}
//this is supposed to take the result of 'String_getCString' and reverse the process to get the String*
//i.e. String_getCString(CString_getString(str)) == str
String* CString_getString(char *str){
//???
}
int foo(char *cstr){
printf("%s\n", cstr);
fflush(0);
free(CString_getString(cstr));
}
int main(int argc, char *argv[]){
const char *hello_world = "hello world";
String *str = (String*)malloc(sizeof(String)+1000*sizeof(char));
str->size = 1000;
str->current_location = strlen(hello_world); // the struct has no 'count' member
char *cstr = String_getCString(str);
strcpy(cstr, hello_world);
foo(cstr);
return 0;
}

I'm not 100% sure I understand what you want CString_getString to do, but if you want it to return the address of the overall String object when passed the address of the embedded data field, then that's straightforward, but dangerous:
#include <stddef.h>
String *CString_getString(char *str)
{
return (String *)(str - offsetof(String, data));
}
If the type of the field you wished to "unindex" were anything other than [signed/unsigned/] char, you would need to cast the input pointer to char * before the subtraction, as well as casting to the desired return type afterward.
This is dangerous because CString_getString has no way of knowing whether you've passed in a str that really is the embedded data field of a String object. If you get it wrong, the C compiler sits back and watches it blow up on you at runtime. But, arguably, this is no worse than anything else one does in C all the time, and this can be a useful technique. It is, for instance, heavily used in the guts of Linux: http://lxr.free-electrons.com/ident?i=container_of
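A minimal C++ sketch of the round trip described above. It assumes a data[1] member in place of the question's data[0] (standard C++ has no zero-length arrays), and String_create is a hypothetical helper added only for illustration:

```cpp
#include <cassert>
#include <cstddef>   // offsetof
#include <cstdlib>   // std::malloc, std::free
#include <cstring>   // std::strcpy in the usage note

// Assumption: data[1] stands in for the question's data[0]; the buffer is
// over-allocated so the characters extend past the end of the struct.
struct String {
    int current_location;
    int size;
    char data[1];
};

// Hypothetical helper: allocate the header plus `capacity` bytes of storage.
String* String_create(int capacity) {
    assert(capacity > 0);
    void* raw = std::malloc(sizeof(String) + static_cast<std::size_t>(capacity));
    assert(raw != nullptr);
    String* str = static_cast<String*>(raw);
    str->current_location = 0;
    str->size = capacity;
    return str;
}

char* String_getCString(String* str) {
    return &str->data[0];
}

// Step back from the embedded field to the start of the enclosing object.
// Only valid when cstr really came from String_getCString.
String* CString_getString(char* cstr) {
    assert(cstr != nullptr);
    return reinterpret_cast<String*>(cstr - offsetof(String, data));
}
```

With these definitions, CString_getString(String_getCString(str)) == str holds for any str returned by String_create, and the result can be passed to std::free. Nothing checks that the pointer really belongs to a String, which is exactly the danger described above.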

Related

Why can we return char* from function?

Here is a piece of C++ code that shows some very peculiar behavior. Who can tell me why strB can print out the stuff?
char* strA()
{
char str[] = "hello word";
return str;
}
char* strB()
{
char* str = "hello word";
return str;
}
int main()
{
cout<<strA()<<endl;
cout<<strB()<<endl;
}
Why does strB() work?
A string literal (e.g. "a string literal") has static storage duration. That means its lifetime spans the duration of your program's execution. This can be done because the compiler knows every string literal that you are going to use in your program, hence it can store their data directly into the data section of the compiled executable (example: https://godbolt.org/z/7nErYe)
When you obtain a pointer to it, this pointer can be passed around freely (including being returned from a function) and dereferenced as the object it points to is always alive.
Why doesn't strA() work?
However, initializing an array of char from a string literal copies the content of the string literal. The created array is a different object from the original string literal. If such an array is a local variable (i.e. has automatic storage duration), as in your strA(), then it is destroyed when the function returns.
When you return from strA(), since the return type is char* an "array-to-pointer-conversion" is performed, creating a pointer to the first element of the array. However, since the array is destroyed when the function returns, the pointer returned becomes invalid. You should not try to dereference such pointers (and avoid creating them in the first place).
String literals exist for the life of the program.
String literals have static storage duration, and thus exist in memory for the life of the program.
That means cout<<strB()<<endl; is fine, the returned pointer pointing to string literal "hello word" remains valid.
On the other hand, cout<<strA()<<endl; leads to UB. The returned pointer points to the first element of the local array str, which is destroyed when strA() returns, leaving the returned pointer dangling.
BTW: String literals are of type const char[], so char* str = "hello word"; is invalid since C++11. Change it to const char* str = "hello word";, and change the return type of strB() to const char* too.
String literals are not convertible or assignable to non-const CharT*. An explicit cast (e.g. const_cast) must be used if such conversion is wanted. (since C++11)
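As a rough sketch of the fixes suggested above (not the only way to do it): strB keeps returning a pointer to the literal but with a const-correct type, and strA returns a std::string by value instead of a pointer to a local array. check_lifetimes is a hypothetical helper added only to exercise both functions:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// The literal has static storage duration, so a pointer to it stays valid
// after the function returns. const, because literals must not be modified.
const char* strB() {
    const char* str = "hello word";
    return str;
}

// Returning std::string by value hands the characters to the caller,
// so nothing refers to the local object after the function returns.
std::string strA() {
    std::string str = "hello word";
    return str;
}

// Hypothetical helper: both results are safe to read after the calls return.
void check_lifetimes() {
    assert(std::strcmp(strB(), "hello word") == 0);
    assert(strA() == "hello word");
}
```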
case 1:
#include <stdio.h>
char *strA() {
char str[] = "hello world";
return str;
}
int main(int argc, char **argv) {
puts(strA());
return 0;
}
The array declared by char str[] = "hello world"; is (probably) placed on the stack when strA is called, and it expires once the function exits, so the pointer handed to puts is dangling.
You can kinda cheat this with a continuation. The continuation is called ON TOP of the existing stack frame, so the data of the function still exists because the function hasn't returned yet. This is well-defined, since str is still alive while the callee runs:
#include <stdio.h>
void strA(void (*continuation)(char *)) {
char str[] = "hello world";
continuation(str);
}
void myContinuation(char *arg) {
puts(arg);
}
int main(int argc, char **argv) {
strA(myContinuation);
return 0;
}
case 2:
If you use the snippet below, the literal "hello world" is usually stored in protected read-only memory (trying to modify this string will cause a segmentation fault on many systems). This is similar to how your main and strA are stored: C code is basically just a blob of instructions in memory, in the same way a string is a sequence of characters, but I digress. The string is available to the program even if the function is never called, provided you know the address it is supposed to be at on the specific system. In the snippet below, the program prints the string without even calling the function first. This will often work on the same platform with much the same code and the same compiler, but reading from a hard-coded address like this is undefined behavior.
#include <stdio.h>
char *strB() {
char *str = "hello world";
return str;
}
int main(int argc, char **argv) {
char *myStr;
// comment the line below and replace it with
// result of &myStr[0], in my case, result of &myStr[0] is 4231168
printf("is your string: %s.\n", (char *)4231168);
myStr = strB();
printf("str is at: %lld\n", (long long)&myStr[0]); // cast so the argument matches %lld
return 0;
}
You can opt for a strC using structs and relative safety. This structure is created on the stack and FULLY returned. The return of strC is 81(an arbitrary number I made up for the structure, that I trust myself to respect) bytes in size.
#include <stdio.h>
typedef struct {
char data[81];
} MY_STRING;
MY_STRING strC() {
MY_STRING str = {"what year is this?"};
return str;
}
int main(int argc, char **argv) {
puts(strC().data);
printf("size of strC's return: %zu.\n", sizeof(strC())); // sizeof yields size_t, so use %zu
return 0;
}
tl;dr: the array in strA is likely overwritten by the call that prints it as soon as strA returns (since that call now has its own stack frame), whereas the string used in strB exists outside the function: it's basically a pointer to a global constant available as soon as the program starts (the string is there in memory, no different from how the code is in memory).

Simple serialization example in c++

I have the following struct:
typedef struct{
int test;
std::string name;
} test_struct;
Then, I have the following code in the main function:
int main(int argc, char *argv[]){
test_struct tstruct;
tstruct.test = 1;
tstruct.name = "asdfasdf";
char *testout;
int len;
testout = new char[sizeof(test_struct)];
memcpy(testout, &tstruct, sizeof(test_struct) );
std::cout<< testout;
}
However, nothing gets printed. What's wrong?
sizeof(std::string) always yields the same value. It will not give you the runtime length of the string. To serialize using memcpy, either change the struct to contain a char array such as char buffer[20], or compute the size of the required serialized buffer by defining a method on the struct that returns the runtime length in bytes.
If you want to use members like std::string, you need to go through each member of the struct and serialize.
memcpy(testout, (void *)&tstruct.test, sizeof(int) );
memcpy(testout+sizeof(int), tstruct.name.c_str(),tstruct.name.length() );
memcpy against the entire struct will not work in such scenarios.
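One way to make the member-by-member approach reversible is to store the string length in front of the characters. The following is only a sketch: the byte layout ([int test][uint32 length][name bytes]) and the serialize/deserialize names are assumptions, and the format is not portable across machines with a different int size or endianness.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

struct test_struct {
    int test;
    std::string name;
};

// Assumed layout: [int test][uint32 name length][name bytes].
std::vector<char> serialize(const test_struct& in) {
    const std::uint32_t len = static_cast<std::uint32_t>(in.name.size());
    std::vector<char> out(sizeof(in.test) + sizeof(len) + len);
    char* p = out.data();
    std::memcpy(p, &in.test, sizeof(in.test));
    p += sizeof(in.test);
    std::memcpy(p, &len, sizeof(len));
    p += sizeof(len);
    std::memcpy(p, in.name.data(), len);   // the characters, not the std::string object
    return out;
}

test_struct deserialize(const std::vector<char>& buf) {
    test_struct out{};
    std::uint32_t len = 0;
    assert(buf.size() >= sizeof(out.test) + sizeof(len));
    const char* p = buf.data();
    std::memcpy(&out.test, p, sizeof(out.test));
    p += sizeof(out.test);
    std::memcpy(&len, p, sizeof(len));
    p += sizeof(len);
    assert(buf.size() == sizeof(out.test) + sizeof(len) + len);
    out.name.assign(p, len);
    return out;
}
```

The buffer is binary, so printing it with std::cout << is still not meaningful; write it with ostream::write or inspect it byte by byte instead.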
Try NULL-terminating the string and also emitting a newline:
testout = new char[sizeof(test_struct) + 1];
memcpy(testout, &tstruct, sizeof(test_struct));
testout[sizeof(test_struct)] = '\0';
std::cout<< testout << std::endl;
However, as user3543576 points out, the serialization you get from this process won't be too useful, as it will contain a memory address of a character buffer, and not the actual string itself.

C++ : Why I can't print a const char* with sprintf?

What am I missing here ? It's driving me nuts !
I have a function that returns a const char*
const char* Notation() const
{
char s[10];
int x=5;
sprintf(s, "%d", x);
return s;
}
Now in another part of the code I am doing this :
.....
.....
char str[50];
sprintf(str, "%s", Notation());
.....
.....
but str remains unchanged.
If instead I do this :
.....
.....
char str[50];
str[0]=0;
strcat(str, Notation());
.....
.....
str is correctly set.
I am wondering why sprintf doesn't work as expected...
You're returning a pointer to an array allocated on the stack, and using that pointer after the function returns is undefined behaviour.
const char* Notation() const
{
char s[10];
int x=5;
sprintf(s, "%d", x);
return s;
}
here s isn't going to be around after you've returned from the function Notation(). If you aren't concerned with thread safety you could make s static.
const char* Notation() const
{
static char s[10];
....
In both cases, it invokes undefined behavior, as Notation() returns a local array which gets destroyed on returning. You're unlucky that it works in one case, making you feel that it is correct.
The solution is to use std::string as:
std::string Notation() const
{
char s[10];
int x=5;
sprintf(s, "%d", x);
return s; //it is okay now, s gets converted into std::string
}
Or using C++ stream as:
std::string Notation() const
{
int x=5;
std::ostringstream oss;
oss << x;
return oss.str();
}
and then:
char str[50];
sprintf(str, "%s", Notation().c_str());
The benefit (and beauty) of std::ostringstream (and std::string) is that you don't have to know the size of output in advance, which means you don't have to use magic number such as 10 in array declaration char s[10]. These classes are safe in that sense.
char s[10] in Notation is placed on stack so it gets destroyed after exit from Notation function. Such variables are called automatic. You need to save your string in heap using new:
char *s = new char[10];
But you have to free this memory manually:
char str[50];
const char *nt = Notation();
sprintf(str, "%s", nt);
printf("%s", str);
delete[] nt;
If you are really using C++, then use the built-in string class as Nawaz suggested. If you are somehow restricted to raw pointers, then allocate the buffer outside Notation and pass it in as a destination parameter, as sprintf or strcat do.
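The destination-parameter variant could look roughly like this. It is a sketch only: a free function stands in for the member function from the question, and the out/out_size parameters are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Writes the notation into a caller-owned buffer and returns that buffer,
// so no pointer to a local array ever leaves the function.
const char* Notation(char* out, std::size_t out_size) {
    assert(out != nullptr && out_size > 0);
    int x = 5;
    std::snprintf(out, out_size, "%d", x);   // never writes more than out_size bytes
    return out;
}
```

The caller then owns both buffers, for example char s[10]; char str[50]; followed by sprintf(str, "%s", Notation(s, sizeof s));, and nothing needs to be freed.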

c++: write a char at a given char* causes segfault

I want to copy a char to an address where a given char* points to.
it's in a function which is called by main:
char* data = " ";
myfunction(data, somethingelse);
...
inside the function i have something like
void myfunction(char* data, short somethingelse) {
...
char byte = 0;
inputfilestream.read(&byte, 1);
*data = byte; // here i get the segfault
data++;
...
}
the segfault also comes when i do the copy using strncpy:
strncpy(data, byte, 1);
why is there a segfault? data isn't const and the address where i actually write to is exactly the same as the one where i allocated the data-array. i've tested that multiple times.
thanks in advance.
String literals are read-only. If you want a modifiable string, you must use an array, e.g.:
char data[10];
Or:
char *data = new char[10];
To elaborate a bit more: the type of a string literal is actually an array of const char (const char[N]), which decays to const char*. Assigning a string literal to a non-const char* is therefore technically invalid (ill-formed since C++11), but many compilers allow it anyway for legacy reasons. Many modern compilers will at least issue a warning when you try to do that.
data is assigned a string literal. String literals are read-only, and writing to them is undefined behavior that typically causes a segfault.
Try this:
char data[10]; // or whatever size you want.
instead.
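For illustration, here is a sketch of the byte-copy loop from the question writing into a caller-owned array. The read_bytes name, the capacity parameter, and the use of std::istream are assumptions, not the asker's actual code:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>   // std::istringstream, used in the usage note

// Reads up to capacity - 1 bytes from `in` into the caller's buffer,
// null-terminates it, and returns the number of bytes stored.
std::size_t read_bytes(std::istream& in, char* data, std::size_t capacity) {
    assert(data != nullptr && capacity > 0);
    std::size_t count = 0;
    char byte = 0;
    while (count + 1 < capacity && in.read(&byte, 1)) {
        *data = byte;   // fine: data points into writable storage
        ++data;
        ++count;
    }
    *data = '\0';
    return count;
}
```

A call such as std::istringstream in("abc"); char data[10]; read_bytes(in, data, sizeof data); writes into the array, never into a string literal, and the capacity check keeps data++ from running past the end.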
why is there a segfault? data isn't const and the address where i actually write to is exactly the same as the one where i allocated the data-array.
You didn't allocate anything. char *data = " "; shouldn't even compile in C++. You are assigning a constant string to a non-constant.
char byte = 0;
inputfilestream.read(&byte, 1);
*data = byte; // here i get the segfault
data++; // << How many times?
No problem
#include <stdio.h>
int main(int argc, char **argv)
{
char data[] = "Yello"; // an array copy of the literal, so it is writable; char *data = "Yello" would point at read-only memory
*data = 'H';
puts(data); // Hello
}

how to copy char * into a string and vice-versa

If i pass a char * into a function. I want to then take that char * convert it to a std::string and once I get my result convert it back to char * from a std::string to show the result.
I don't know how to do this for conversion ( I am not talking const char * but just char *)
I am not sure how to manipulate the value of the pointer I send in.
so steps i need to do
take in a char *
convert it into a string.
take the result of that string and put it back in the form of a char *
return the result such that the value should be available outside the function and not get destroyed.
If possible, can I see how it could be done via a reference vs. a pointer? (I pass the pointer's address in by value, but I can still modify the value that the pointer points to, so even though the copy of the pointer in the function gets destroyed, I still see the changed value outside.)
thanks!
Converting a char* to a std::string:
const char* c = "Hello, world"; // string literals are const
std::string s(c);
Converting a std::string to a char*:
std::string s = "Hello, world";
char* c = new char[s.length() + 1];
strcpy(c, s.c_str());
// and then later on, when you are done with the `char*`:
delete[] c;
I prefer to use a std::vector<char> instead of an actual char*; then you don't have to manage your own memory:
std::string s = "Hello, world";
std::vector<char> v(s.begin(), s.end());
v.push_back('\0'); // Make sure we are null-terminated
char* c = &v[0];
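To tie the two conversions together for the steps in the question, here is a sketch that takes a char*, works on a std::string, and writes the result back so the caller sees it. upper_in_place is a hypothetical name, and uppercasing is just a stand-in for "do something with the string":

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>

// Reference version: the caller's std::string is modified directly.
void upper_in_place(std::string& s) {
    for (char& ch : s) {
        ch = static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
    }
}

// Pointer version: copy into a std::string, transform it, then copy the
// result back into the caller's buffer. Returns false if it does not fit.
bool upper_in_place(char* buf, std::size_t bufsize) {
    assert(buf != nullptr);
    std::string s(buf);                          // char* -> std::string
    upper_in_place(s);
    if (s.size() + 1 > bufsize) {
        return false;
    }
    std::memcpy(buf, s.c_str(), s.size() + 1);   // std::string -> char*, including '\0'
    return true;
}
```

Because the memory belongs to the caller in both versions, the result is still there after the function returns and nothing has to be deleted.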
You need to watch how you handle the memory from the pointer you return, for example the code below will not work because the memory allocated in the std::string will be released when fn() exits.
const char* fn(const char*psz) {
std::string s(psz);
// do something with s
return s.c_str(); //BAD
}
One solution is to allocate the memory in the function and make sure the caller of the function releases it:
const char* fn(const char*psz) {
std::string s(psz);
// do something with s
char *ret = new char[s.size() + 1]; //memory allocated, +1 for the terminating '\0'
strcpy(ret, s.c_str());
return ret;
}
....
const char* p = fn("some text");
//do something with p
delete[] p;// release the array of chars
Alternatively, if you know an upper bound on the size of the string you can create it on the stack yourself and pass in a pointer, e.g.
void fn(const char* psz, size_t bufsize, char* out) {
std::string s(psz);
// do something with s
strcpy_s(out, bufsize, s.c_str()); //strcpy_s is a microsoft specific safe str copy
}
....
const int BUFSIZE = 100;
char str[BUFSIZE];
fn("some text", BUFSIZE, str);
//ok to use str (memory gets deleted when it goes out of scope)
You can maintain a garbage collector for your library implemented as
std::vector<char*> g_gc; which is accessible in your library 'lib'. Later, you can release all pointers in g_gc at your convenience by calling lib::release_garbage();
char* lib::func(char*pStr)
{
std::string str(pStr);
char *outStr = new char[str.size()+1];
strcpy(outStr, str.c_str());
g_gc.push_back(outStr); // collect garbage
return outStr;
}
release_garbage function will look like:
void lib::release_garbage()
{
for(size_t i=0;i<g_gc.size();i++)
{
delete[] g_gc[i]; // allocated with new char[], so release with delete[]
}
g_gc.clear();
}
In a single threaded model, you can keep this g_gc static. Multi-threaded model would involve locking/unlocking it.
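If the library's interface is allowed to change, a different technique from the g_gc list above is to return an owning smart pointer, so each buffer is released automatically when the caller is done with it. This is a sketch under that assumption; copy_to_owned is a hypothetical name:

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <string>

// Returns a heap copy of the string that frees itself (with delete[])
// when the returned unique_ptr goes out of scope.
std::unique_ptr<char[]> copy_to_owned(const char* pStr) {
    assert(pStr != nullptr);
    std::string str(pStr);
    std::unique_ptr<char[]> out(new char[str.size() + 1]);
    std::memcpy(out.get(), str.c_str(), str.size() + 1);   // includes '\0'
    return out;
}
```

The raw char* is available through get() for APIs that need it, and no global list or locking is involved.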