Binary file read error - c++

The program runs without problems the first time, but when I comment out part of the code, it terminates:

The problem is that you are trying to read objects you can't really save to files, or load from files.
Let's take that std::string member name. A std::string object is basically just a pointer to a dynamically allocated array of characters (i.e. a C-style zero-terminated string), plus the length of the contained string. The problem is two-fold: first, when you attempt to save the name object, it doesn't save the string but the pointer; second, pointers to dynamically allocated data are unique per process.
What happens when you load the object is that you read and set the pointer, but only the pointer. This pointer was valid in the process that wrote the object, but not in the current process: it doesn't point to any valid memory allocated by your process. Using this pointer, which is what happens when you use the string object, then leads to undefined behavior, and UB is one of the most common reasons for crashes.
What you need to do is to serialize the string. If you want to write the code yourself and not use a library for it (there are many great serialization libraries you can use), then you need to write the string length as a fixed-size integer, then write the actual string data. When you deserialize, you first must know that the next piece of data to read is a string, then read the length, followed by the string data, and then construct your string object from that.
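The length-prefix scheme described above can be sketched as follows. This is a minimal illustration, not the asker's code: the names writeString, readString and roundTrip are made up for the example, the length is assumed to fit in 32 bits, and the file is assumed to be read on a machine with the same byte order as the one that wrote it.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: write the length as a fixed-size integer,
// then the raw characters (no pointer is ever written).
void writeString(std::ostream& out, const std::string& s) {
    assert(s.size() <= std::numeric_limits<std::uint32_t>::max());
    const std::uint32_t len = static_cast<std::uint32_t>(s.size());
    out.write(reinterpret_cast<const char*>(&len), sizeof len);
    out.write(s.data(), static_cast<std::streamsize>(len));
}

// Hypothetical helper: read the length first, then exactly that many
// characters, and only then build the string object.
bool readString(std::istream& in, std::string& s) {
    std::uint32_t len = 0;
    if (!in.read(reinterpret_cast<char*>(&len), sizeof len)) return false;
    std::string tmp(len, '\0');
    if (len > 0 && !in.read(&tmp[0], static_cast<std::streamsize>(len))) return false;
    s.swap(tmp);
    return true;
}

// Round-trips a string through an in-memory binary stream.
std::string roundTrip(const std::string& s) {
    std::stringstream buf(std::ios::in | std::ios::out | std::ios::binary);
    writeString(buf, s);
    std::string result;
    const bool ok = readString(buf, result);
    return ok ? result : std::string();
}
```

With a real file you would pass a std::ofstream or std::ifstream opened with std::ios::binary instead of the string stream. A robust reader would also reject implausibly large lengths before allocating.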


Arduino C++: QueueArray does not store String Objects?

Is there a specific catch with storing Arduino String objects in a QueueArray?
When I try the following code, the Arduino just stops executing at the "enqueue" call.
QueueArray <String> q;
String s = "blah";
q.enqueue(s);
Serial.println("checkpoint"); delay(1000);
Serial.println(q.peek()); delay(1000);
The same code works for storing integers, and even (char *). What am I missing?
From inspecting the header file (it contains only template functions, so it is also the source):
http://playground.arduino.cc/uploads/Code/QueueArray.zip
I believe the constructor of the queue gets you into this trouble.
I have had troubles allocating objects in the queue.
I decided to put only the references in there, stored as a number, and then on the receiving end to use that number as the address for a pointer of the object type.
This is btw. a risky strategy!
Why not just enqueue the result of c_str() (https://www.arduino.cc/en/Reference/CStr)?
Edit: this answer to a comment would be too long for a comment:
The Arduino is a µProcessor with very limited memory (~2K of RAM).
(About 32K is available for your program, which is stored in flash.)
Remember that it is also a machine with limited memory management.
So the stack and the heap are both very small, and in general, when the heap is used, it can get fragmented very quickly.
As a C++ programmer that looks at your code, one might expect that the string is in fact stack allocated. (hint: there is no new keyword in there) so even if it internally would hold characters in the heap, this should allow you to assume that the string object would be destroyed when you exit the scope.
(You should also expect that it would clean up after itself and deallocate the dynamic memory used for the characters. Depending on your version of the libraries, this does in fact work, or there is a problem with re-allocate / deallocate, making things worse than we need to discuss here.)
Learn the difference between heap and stack:
C++ Object Instantiation
You can reduce heap fragmentation with Arduino strings, for example by reserving memory before doing string operations.
But every time you decide to use a string, you will most likely fragment the heap, especially if you let it stay alive after scope is exited (for use on the other end of the queue).
To avoid the fragmentation issues, (remember it's a small system), you could use enums or similar for predefined messages. However if you insist that you actually need to enqueue a string I have a better suggestion.
You could create a small global array of strings to hold the strings that will be enqueued. (This will of course limit the queue size, since there cannot be more messages in the queue than the array of string objects allows.)
This array will be statically allocated (it is global), but the characters referenced by each string would be heap allocated.
This allows you to let the reader of the queue clear the string, and thus free memory on the heap.
However, as the array never leaves scope, the strings are never automatically deleted by the sender.
You would have to actively clear the string after receiving it.
For this solution, there will be some constant memory overhead for the array itself.
Alternatively (and in my opinion much better, since there is no fragmentation at all), you could use something like a global character buffer, reserved for the messages that are being transferred.
Before enqueueing, append the new message (a C-style null-terminated string) after the last one currently in the buffer. The queue should contain the pointer to the message in the char array.
You must always test whether there is room in the buffer for the string before adding a new message.
The way I do this is by having a global "received" pointer that the worker updates when it reads a message.
It simply moves the pointer to the string's terminating null character when it is done processing it.
The producer then has another, local pointer for remembering where it last wrote something.
Since the received pointer is global, the producer can always calculate the "distance" available between the pointers, and know how many characters can be written to the message buffer.
This is a simple circular buffer, so you need to handle overflowing etc. by adding a read and a write method.
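A circular message buffer with a read and a write method could look roughly like the sketch below. It is a simplified variant, not the exact scheme described above: instead of queuing pointers into the buffer, it stores the null-terminated messages in the ring and copies the oldest one out on read, which keeps wrap-around handling in one place. The class name MessageRing and its members are hypothetical, the code is plain standard C++ (on an AVR board the headers would be string.h and stddef.h), and it is not safe to call from an interrupt handler as written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fixed-size ring holding null-terminated messages.
template <std::size_t N>
class MessageRing {
    static_assert(N > 0, "buffer must not be empty");

public:
    // Copies msg including its terminating '\0'.
    // Returns false (and writes nothing) if there is not enough room.
    bool write(const char* msg) {
        const std::size_t len = std::strlen(msg) + 1;
        if (len > freeSpace()) return false;
        for (std::size_t i = 0; i < len; ++i) {
            buf_[head_] = msg[i];
            head_ = (head_ + 1) % N;  // wrap around at the end
        }
        used_ += len;
        return true;
    }

    // Copies the oldest message into out. Returns false if the ring is
    // empty or out is too small; nothing is consumed in either case.
    bool read(char* out, std::size_t outSize) {
        if (used_ == 0) return false;
        std::size_t len = 0;
        while (buf_[(tail_ + len) % N] != '\0') ++len;
        ++len;  // include the terminator
        assert(len <= used_);
        if (len > outSize) return false;
        for (std::size_t i = 0; i < len; ++i) {
            out[i] = buf_[tail_];
            tail_ = (tail_ + 1) % N;
        }
        used_ -= len;
        return true;
    }

    std::size_t freeSpace() const { return N - used_; }

private:
    char buf_[N] = {};
    std::size_t head_ = 0;  // next position to write
    std::size_t tail_ = 0;  // start of the oldest message
    std::size_t used_ = 0;  // bytes currently stored
};
```

Because every write is all-or-nothing, the ring never contains a partial message, so the reader can always find a terminator within the used region.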

Want to know the length of a buffer C++

I am new to C++ and programming and I would like to know if there is a way to get the length of a pointer.
Let's say MyHeader is a struct with different types of data inside.
My code goes like this:
char *pStartBuffer;
memcpy(pStartBuffer, &MyHeader, MyHeader.u32Size);
So I want to know the length of the buffer so that I can copy the data to a file using the Qt write function.
file.write(pStartBuffer, length(pStartBuffer));
How can I do this?
A pointer doesn't know the size of the allocation it points to.
You may use std::vector, which keeps track of the size for you:
std::vector<char> pStartBuffer(MyHeader.u32Size);
memcpy(pStartBuffer.data(), &MyHeader, MyHeader.u32Size);
And later:
file.write(pStartBuffer.data(), pStartBuffer.size());
There is no way to find the length of a buffer given nothing but a pointer. If you are certain that it's a string you can use one of the string length functions, or you can keep track of the length of the buffer yourself.
A pointer doesn't have a "length". What you need is the length of the array that the pointer points to.
No, you cannot extract that information from the pointer.
If the array contains a valid, null-terminated character string, then you can get the length of that string by iterating it until you find a null character which is what strlen does.
If not, then what you normally do is you store the length in a variable when you allocate the array. Which is one of the things that std::vector or std::string will do for you, whichever is more appropriate for your use.
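As a small illustration of letting std::vector carry the length, here is a sketch under stated assumptions. The Header layout is invented for the example (the question only says that the struct has a u32Size member), and toBuffer is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical header layout; only u32Size comes from the question.
struct Header {
    std::uint32_t u32Size;     // number of bytes to copy
    std::uint32_t u32Version;  // made-up extra field
};

// Copies the header into a buffer that remembers its own size.
std::vector<char> toBuffer(const Header& header) {
    assert(header.u32Size <= sizeof header);  // never read past the object
    std::vector<char> buffer(header.u32Size);
    if (!buffer.empty()) {
        std::memcpy(buffer.data(), &header, buffer.size());
    }
    return buffer;
}
```

The returned buffer can then be handed to the write call from the question as buffer.data() and buffer.size(), with no separate length variable to keep in sync.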

(Why) does an empty string have an address?

I guessed no, but the output of something like this shows that it does:
string s="";
cout<<&s;
What is the point of an empty string having an address?
Don't you think it should not cost any memory at all?
Yes, every variable that you keep in memory has an address. As for what the "point" is, there may be several:
Your (literal) string is not actually "empty", it contains a single '\0' character. The std::string object that is created to contain it may allocate its own character buffer for holding this data, so it is not necessarily empty either.
If you are using a language in which strings are mutable (as is the case in C++), then there is no guarantee that an empty string will remain empty.
In an object-oriented language, a string instance with no data associated with it can still be used to call various instance methods on the string class. This requires a valid object instance in memory.
There is a difference between an empty string and a null string. Sometimes the distinction can be important.
And yes, I very much agree with the implementation of the language that an "empty" variable should still exist in and consume memory. In an object-oriented language an instance of an object is more than just the data that it stores, and there's nothing wrong with having an instance of an object that is not currently storing any actual data.
Following your logic, int i; would also not allocate any memory space, since you are not assigning any value to it. But how is it possible then, that this subsequent operation i = 10; works after that?
When you declare a variable, you are actually allocating memory space of a certain size (depending on the variable's type) to store something. Whether you use this space right away or not is up to you, but the declaration of the variable is what triggers memory allocation for it.
Some coding practices say you shouldn't declare a variable until the moment you need to use it.
An 'empty' string object is still an object - there may be more to its internal implementation than just the memory required to store the literal string itself. Besides that, most C-style strings (like the ones used in C++) are null-terminated, meaning even that "empty" string still uses one byte for the terminator.
Every named object in C++ has an address. There is even a specific requirement that the size of every type be at least 1 so that T[N] and T[N+1] are different, or so that in T a, b; both variables have distinct addresses.
In your case, s is a named object of type std::string, so it has an address. The fact that you constructed s from a particular value is immaterial. What matters is that s has been constructed, so it is an object, so it has an address.
s is a string object so it has an address. It has some internal data structures keeping track of the string. For example, current length of the string, current storage reserved for string, etc.
More generally, the C++ standard requires all objects to have a nonzero size. This helps ensure that every object has a unique address.
From section 9 (Classes) of the standard:
"Complete objects and member subobjects of class type shall have nonzero size."
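The nonzero-size rule is easy to check with a few lines. This is only a sketch; the type Empty and the two helper functions are names made up for the example.

```cpp
#include <cassert>
#include <string>

// A class with no data members at all.
struct Empty {};

// Checked at compile time: even an empty class occupies at least one byte.
static_assert(sizeof(Empty) >= 1, "complete objects have nonzero size");

// Two distinct objects of the same type never share an address.
bool distinctAddresses() {
    Empty a, b;
    return &a != &b;
}

// An empty std::string is still a real object with its own storage.
bool emptyStringHasAddress() {
    std::string s = "";
    assert(s.empty());
    const std::string* p = &s;
    return p != nullptr && sizeof s > 0;
}
```

The exact value of sizeof(std::string) depends on the compiler and library, so the sketch only checks that it is greater than zero.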
In C++, all classes are a specific, unchanging size. (varying by compiler and library, but specific at compile-time.) The std::string usually consists of a pointer, a length of allocation, and a length used. That's ~12 bytes, no matter how long the string is, and you have allocated std::string s on the call stack. When you display the address of the std::string, cout displays the location of the std::string in memory.
If the string doesn't point at anything, it won't allocate any space from the heap, which is like what you're thinking. But all C-strings end in a trailing null character, so the C-string "" is one character long, not zero. This means that when you assign the C-string "" to the std::string, the std::string allocates 1 (or more) bytes and stores the trailing null character ('\0') there.
If there truly was no point to the empty string, then the programmer would not write the instruction at all. The language is loyal and trusting! And will never assume memory you allocate to be "wasted". Even if you are lost and heading over a cliff, it will hold your hand to the bitter end.
I think it'd be interesting to know, just as a curiosity though, that if you create a variable that isn't 'used' later, such as your empty string, the compiler may very well optimize it away so it incurs no cost to begin with. I guess compilers aren't as trusting...

How to convert memory to a byte array?

I have a database in which one field's type is an array of bytes.
I also have a piece of memory, or an object. How do I convert this piece of memory, or even an object, to a byte array so that I can store the byte array in the database?
Suppose the object is
Foo foo
The memory is
buf (actually, don't know how to declare it yet)
The database field is
byte data[256]
Only hex values like x'1' can be inserted into the field.
Thanks so much!
There are two methods.
One is simple but has serious limitations. You can write the memory image of the Foo object. The drawback is that if you ever change the compiler or the structure of Foo, your stored data may no longer be loadable (because the image no longer matches the object). To do this, simply use
&foo
as the start of the byte array, with sizeof(foo) as its length.
The other method is called 'serialization'. It keeps working when the object changes, but the encoding takes extra space. If you only have 256 bytes, then you need to make sure that serialization doesn't create a string too large to save.
Boost has a serialization library you may want to look at, though you'll need to be careful about the size of the objects created. If you're only doing this with a small set of classes, you may want to write the marshalling and unmarshalling functions yourself.
From the documentation:
"Here, we use the term "serialization" to mean the reversible deconstruction of an arbitrary set of C++ data structures to a sequence of bytes. "
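The first method (copying the memory image) can be sketched like this. The Foo layout here is invented for the example, since the question does not show it, and toBytes, fromBytes and toHex are hypothetical helper names. The approach is only valid for trivially copyable types without pointers, which the static_assert checks. The hex helper produces plain hex digits; how they are wrapped into a literal such as x'...' depends on the database.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical object layout; the real Foo is not shown in the question.
struct Foo {
    std::int32_t id;
    std::int32_t count;
};

static_assert(std::is_trivially_copyable<Foo>::value,
              "a memory image only makes sense for trivially copyable types");
static_assert(sizeof(Foo) <= 256, "must fit into the 256-byte database field");

using Bytes = std::array<unsigned char, sizeof(Foo)>;

// Copies the object's memory image into a byte array.
Bytes toBytes(const Foo& foo) {
    Bytes bytes{};
    std::memcpy(bytes.data(), &foo, sizeof foo);
    return bytes;
}

// Rebuilds an object from a byte array produced by toBytes.
Foo fromBytes(const Bytes& bytes) {
    Foo foo;
    std::memcpy(&foo, bytes.data(), sizeof foo);
    return foo;
}

// Encodes raw bytes as lowercase hex digits, two per byte.
std::string toHex(const unsigned char* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    static const char digits[] = "0123456789abcdef";
    std::string hex;
    hex.reserve(size * 2);
    for (std::size_t i = 0; i < size; ++i) {
        hex.push_back(digits[data[i] >> 4]);
        hex.push_back(digits[data[i] & 0x0F]);
    }
    return hex;
}
```

The stored bytes are only meaningful to a program built with the same struct layout and byte order, which is the limitation the answer above warns about.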

Access Violation Using memcpy or Assignment to an Array in a Struct

Update 2:
Well I’ve refactored the work-around that I have into a separate function. This way, while it’s still not ideal (especially since I have to free outside the function the memory that is allocated inside the function), it does afford the ability to use it a little more generally. I’m still hoping for a more optimal and elegant solution…
Update:
Okay, so the reason for the problem has been established, but I’m still at a loss for a solution.
I am trying to figure out an (easy/effective) way to modify a few bytes of an array in a struct. My current work-around of dynamically allocating a buffer of equal size, copying the array, making the changes to the buffer, using the buffer in place of the array, then releasing the buffer seems excessive and less-than optimal. If I have to do it this way, I may as well just put two arrays in the struct and initialize them both to the same data, making the changes in the second. My goal is to reduce both the memory footprint (store just the differences between the original and modified arrays), and the amount of manual work (automatically patch the array).
Original post:
I wrote a program last night that worked just fine but when I refactored it today to make it more extensible, I ended up with a problem.
The original version had a hard-coded array of bytes. After some processing, some bytes were written into the array and then some more processing was done.
To avoid hard-coding the pattern, I put the array in a structure so that I could add some related data and create an array of them. However now, I cannot write to the array in the structure. Here’s a pseudo-code example:
main() {
    char pattern[]="\x32\x33\x12\x13\xba\xbb";
    PrintData(pattern);
    pattern[2]='\x65';
    PrintData(pattern);
}
That one works but this one does not:
struct ENTRY {
    char* pattern;
    int somenum;
};
main() {
    ENTRY Entries[] = {
        {"\x32\x33\x12\x13\xba\xbb\x9a\xbc", 44}
      , {"\x12\x34\x56\x78", 555}
    };
    PrintData(Entries[0].pattern);
    Entries[0].pattern[2]='\x65'; //0xC0000005 exception!!! :(
    PrintData(Entries[0].pattern);
}
The second version causes an access violation exception on the assignment. I’m sure it’s because the second version allocates memory differently, but I’m starting to get a headache trying to figure out what’s what or how to fix this. (I’m currently working around it by dynamically allocating a buffer of the same size as the pattern array, copying the pattern to the new buffer, making the changes to the buffer, using the buffer in place of the pattern array, and then trying to remember to free the temporary buffer.)
(Specifically, the original version cast the pattern array—+offset—to a DWORD* and assigned a DWORD constant to it to overwrite the four target bytes. The new version cannot do that since the length of the source is unknown—may not be four bytes—so it uses memcpy instead. I’ve checked and re-checked and have made sure that the pointers to memcpy are correct, but I still get an access violation. I use memcpy instead of str(n)cpy because I am using plain chars (as an array of bytes), not Unicode chars and ignoring the null-terminator. Using an assignment as above causes the same problem.)
Any ideas?
Attempting to modify a string literal is undefined behavior. Your
Entries[0].pattern[2]='\x65';
line attempts exactly that. In your second example you are not allocating any memory for the strings. Instead, you are making the pointers in the struct objects point directly at string literals. And string literals are not modifiable.
This question gets asked several times every day. Read "Why is this string reversal C code causing a segmentation fault?" for more details.
The problem boils down to the fact that a char[] is not a char*, even if the char[] acts a lot like a char* in expressions.
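The difference between the two declarations can be made visible with sizeof. The sketch below uses made-up function names and a shortened pattern. The array form owns a writable copy of the bytes, while the pointer form only stores the address of the literal, which is why it is declared const here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// An array initialized from a literal holds its own copy: 4 bytes plus '\0'.
std::size_t arrayBytes() {
    char pattern[] = "\x32\x33\x12\x13";
    return sizeof pattern;
}

// A pointer only stores the address of the literal.
std::size_t pointerBytes() {
    const char* pattern = "\x32\x33\x12\x13";
    return sizeof pattern;
}

// Writing into the array is fine because it is the function's own copy.
char patchArray() {
    char pattern[] = "\x32\x33\x12\x13";
    assert(std::strlen(pattern) == 4);
    pattern[2] = '\x65';
    return pattern[2];
}
```

In current C++, assigning a string literal to a plain char* (without const) is not allowed at all, so a modern compiler rejects or at least warns about the struct initialization from the question.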
Other answers have addressed the reason for the error: you're modifying a string literal which is not allowed.
This question is tagged C++ so the easy way to solve your problem is to use std::string.
#include <string>

struct ENTRY {
    std::string pattern;
    int somenum;
};
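A slightly fuller sketch of the std::string approach follows. The names Entry, makeEntries and patch are chosen for the example and are not from the question. The patterns are constructed with an explicit length, on the assumption that a byte pattern may contain a zero byte, which would otherwise cut the string short.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Entry {
    std::string pattern;  // owns a writable copy of the bytes
    int somenum;
};

// Builds the table from the question; each string copies its literal.
std::vector<Entry> makeEntries() {
    return {
        {std::string("\x32\x33\x12\x13\xba\xbb\x9a\xbc", 8), 44},
        {std::string("\x12\x34\x56\x78", 4), 555},
    };
}

// Overwrites bytes.size() bytes of the pattern starting at offset.
void patch(Entry& entry, std::size_t offset, const std::string& bytes) {
    assert(offset + bytes.size() <= entry.pattern.size());
    entry.pattern.replace(offset, bytes.size(), bytes);
}
```

Since each Entry owns its data, there is no temporary buffer to allocate or free, and a patch of any length works as long as it stays inside the pattern.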
Based on your updates, your real problem is this: You want to know how to initialize the strings in your array of structs in such a way that they're editable. (The problem has nothing to do with what happens after the array of structs is created -- as you show with your example code, editing the strings is easy enough if they're initialized correctly.)
The following code sample shows how to do this:
// Allocate the memory for the strings, on the stack so they'll be editable, and
// initialize them:
char ptn1[] = "\x32\x33\x12\x13\xba\xbb\x9a\xbc";
char ptn2[] = "\x12\x34\x56\x78";
// Now, initialize the structs with their char* pointers pointing at the editable
// strings:
ENTRY Entries[] = {
    {ptn1, 44}
  , {ptn2, 555}
};
That should work fine. However, note that the memory for the strings is on the stack, and thus will go away if you leave the current scope. That's not a problem if Entries is on the stack too (as it is in this example), of course, since it will go away at the same time.
Some Q/A on this:
Q: Why can't we initialize the strings in the array-of-structs initialization?
A: Because the strings themselves are not in the structs, and initializing the array only allocates the memory for the array itself, not for things it points to.
Q: Can we include the strings in the structs, then?
A: No; the structs have to have a constant size, and the strings don't have constant size.
Q: This does save memory over having a string literal and then malloc'ing storage and copying the string literal into it, thus resulting in two copies of the string, right?
A: Probably not. When you write
char pattern[] = "\x12\x34\x56\x78";
what happens is that the literal value gets embedded in your compiled code (just like a string literal, basically), and then when that line is executed, the memory is allocated on the stack and the value from the code is copied into that memory. So you end up with two copies regardless: the non-editable version in the compiled code (which has to be there because it's where the initial value comes from), and the editable version elsewhere in memory. This is really mostly about what's simple in the source code, and a little bit about helping the compiler optimize the instructions it uses to do the copying.