Resizing struct / char array (to reduce memory usage) - c++

This is my first project on Arduino/C++/ESP32. I wrote a fairly big program and got almost everything working - except that in the end I realized that the device would run out of breath (memory) periodically and go for a reboot. The reboot is because I configured a watchdog to do so.
There is one area where I think there's a chance to reduce the memory usage but my experience on c++ is "not there yet" for me to be able to write this by myself. Any pointers (no pun intended) please? I have been on this since yesterday and getting rid of one error only results in another new error popping up. Moreover I don't want to come up with something that is hacky or might break later. It should be a quick answer for the experienced people here.
Let me explain the code that I prefer to refactor/optimize.
I need to store a bunch of records that I would need to read/manipulate later. I declared a struct (because they are related fields) globally. Now the issue is that I may need to store 1 record, 2 records or 5 records which I would only know later once I read the data from the EEPROM. And this has to be accessible to all the functions so it has to be a global declaration.
To summarize
Question 1 - how to set "NumOfrecs" later in the program once the data is read from the eeprom.
Question 2 - The size (sizeOfUsername) of the char array username can also change depending upon the length of the username read from the eeprom. At times it might be 5 characters long, at times it could be 25. I can set it to a max of 25 and solve this problem, but then wouldn't I be wasting memory if many usernames were just 4-5 characters long? So in short - just before copying the data from the eeprom into the "username" char array, is it possible to set its size to the optimal size required for holding that data (which is the data size + 1 byte for null termination)?
struct stUSRREC {
char username[sizeOfUsername];
bool online;
};
stUSRREC userRecords[NumOfrecs];
I familiarized myself with a whole bunch of functions like strcpy, memset, malloc etc but now I have run out of time and need to keep the learning part for another day.
I can try to do this in a slightly different manner where I don't use the struct and instead use individual char arrays ( for each field like username ). But then again I'll have to resize the arrays as I read the data from the eeprom.
I can explain all the things I have tried but that will make this question unnecessarily long and perhaps result in losing some clarity. Greatly appreciate any help.
While responding to Q&A on SO I was trying some random stuff and at least this little piece of code below seems to work ( in terms of storing smaller/bigger values )
struct stUSRREC {
char username[];
bool online;
};
stUSRREC userRecords[5];
Then manipulate it this way
strcpy(userRecords[0].username, "MYUSERNAME");
strcpy(userRecords[0].username, "test");
strcpy(userRecords[0].username, "MYVERYBIGUSERNAME");
I have been able to write/rewrite different lengths (above) and can read all of them back correctly. Resizing "userRecords" might be a different game but that can wait a little
One thing I forgot to mention was that I will need to size/resize the array ( holding username ) ONLY ONCE. In the setup() itself I can read/load the required data into those arrays. I am not sure if that opens up any other possibility. The rest of the struct/array I need to manipulate during the running are only boolean and int values. This is not an issue at all because there is no resizing required to do so.
On a side note I am pretty sure I am not the only one who faced this situation. Any tips/clues/pointers could be of help to many others. The constraints on little devices like ESP32 become more visible when you really start loading them with a bunch of things. I had it all working with "Strings" (the capital S) but the periodic reboot (cpu starvation?) required me to get rid of the Strings. Even otherwise I hear that using Strings (on ESP, Arduino and gang) is a bad idea.

You tagged this question as C++, so I'll ask:
Can you use vector and string in your embedded code?
#include <string>
#include <vector>
struct stUSRREC {
std::string username;
bool online;
stUSRREC(const char* name, bool isOnline) :
username(name),
online(isOnline)
{
}
};
std::vector<stUSRREC> userRecords;
Using string as the username type means you only allocate as many characters as are needed to hold the name, instead of allocating an assumed max size of sizeOfUsername. Using vector allows you to grow your record set dynamically.
Then to add a new record:
stUSRREC record("bob", true);
userRecords.push_back(record);
And you may not need NumOfrecs anymore. That's covered by userRecords.size().
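A minimal sketch of how the records might be filled once at startup, assuming the names have already been read from the EEPROM into temporary C strings. The helper name loadUserRecords and its parameters are hypothetical, and the struct is written here as a plain aggregate (no constructor), which needs C++11 for the brace initialisation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct stUSRREC {
    std::string username;  // sized to the name actually stored
    bool online;
};

// Hypothetical helper: builds the record list once, e.g. from setup().
// `names` and `count` stand in for whatever was read from the EEPROM.
std::vector<stUSRREC> loadUserRecords(const char* const* names, std::size_t count) {
    std::vector<stUSRREC> records;
    records.reserve(count);  // reserve the record array up front
    for (std::size_t i = 0; i < count; ++i) {
        records.push_back(stUSRREC{names[i], false});  // everyone starts offline
    }
    return records;
}
```

Because the usernames are sized only once, the heap sees a handful of allocations at startup and none afterwards; the online flags can then be changed in place without any resizing.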

Related

Is it legal to use #define in order to define the size of a static array?

I have many classes in the system that I'm currently developing and in these classes I have an array about the "name" of something. The name should be at most 30 characters.
Initially I used just 10 characters but now I need to increase the limit. Increasing the limit takes time though because I use this kind of array in many places. It would be easier if I used #define NAME_SIZE 30 or something like that and then all I would have to do is change one number instead of around twenty.
However I'm not sure if that's a "legal" thing to do in C++.
It would save me tons of time in the future, that's why I'm asking.
Yes, there is nothing technically wrong with it, except that #define is usually inferior to a const std::size_t MAX_NAME_SIZE = 30; Even better would be to have a dynamic size, e.g. using std::string.
Scott Meyers has an interesting column about systems that use gratuitous fixed sizes, called The Keyhole Problem
The Keyhole Problem arises every time software artificially restricts
something you want to see or something you want to express. If you
want to see an image, but your image-viewing software artificially
restricts how much of that image you can see at a time, that’s the
keyhole problem. If you want to specify a password of a particular
length, but your software says it’s too long, that’s the keyhole
problem. If you want to type in your U.S. telephone number, but your
software refuses to let you punctuate it in the conventional manner
with a dash between the three-digit prefix and the four-digit
exchange, that’s the keyhole problem.
Apart from annoying users, you also open your systems to all sorts of security issues (e.g. buffer overflow exploits).
Yes, it's legal. But it's generally preferable to use actual constants instead of macros:
const int max = 30;
char blah[max];
Another alternative is to use a std::string and not have a hard-coded limit (following the zero-one-infinity rule).
Yes, this is legal, but you might want to use a const int NAME_SIZE = 30;. This can be safely put in a header file. Unlike non-const global variables, having const variables in different translation units (cpp files) creates no problems for the linker, since each constant is local to the file it's defined in.
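As a rough illustration of the constant-in-a-header approach, the sketch below uses constexpr (C++11; plain const works the same way on older compilers). The struct name Product is made up for the example.

```cpp
#include <cstddef>

// Would normally live in a shared header: one named constant instead of a macro.
constexpr std::size_t NAME_SIZE = 30;

// Illustrative class, not taken from the question.
struct Product {
    char name[NAME_SIZE];  // every array that uses NAME_SIZE changes together
};

static_assert(sizeof(Product::name) == NAME_SIZE, "array size follows the constant");
```

Changing the limit later means editing the single definition of NAME_SIZE and recompiling.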

Write a C++ struct to a file and read file using another programming language?

I have a challenging situation; we will have programs on Mac, PC, iOS and Android receiving files in a legacy format and parsing data from those files. We cannot change how those files are created.
The files are produced by a C++ program filling a struct with numbers and Strings and then writing it out. Here's a sanitized version.
struct MyObject {
String Kfkj(MAXKYS);
String Oern(MAXKYS);
String Vdflj(MAXKYS, 9);
int Muic;
int Tdfkj;
int VdfkAsdk;
int SsdjsdDsldsk;
int Ndsoief;
String TdflsajPdlj;
String TdckjdfPas;
String AdsfakjIdd;
int IdkfjdKasdkj;
int AsadkjaKadkja(MAXKYS);
int Kasldsdkj;
bool Usadl;
String PsadkjOasdj(9);
String PasdkjOsdkj;
};
Primitives and Strings, as you can see.
Then here is how they write it out to a file:
MyObject MyInstance;
FileName = "C:\\MyFile.ab2";
ofstream fout (FileName, ios::binary);
fout.write((char*)& MyInstance, sizeof(MyInstance));
There is no option for us to translate it once and then distribute the file to other platforms; we must translate it on each and every different platform, and this is what we have to work with. I'd appreciate any information on how C++ serializes data, so we know how to parse the file.
EDIT: solution
The feedback I received from multiple answers here was VERY helpful. Using that, I did extensive analysis with hex editors and discovered:
the elements come in the file one after another
a "String," in this case, starts with an int describing how many characters follow the int for that String. If the String does not exist, it will still have that int with a value of 0.
integers, for the files and machines I saw, are two bytes, little-endian, and MOSTLY unsigned (there were a few that were signed, just to keep me on my toes)
the boolean was two bytes, with apparently -1 (FF FF) representing "true"
So far we have not run into issues with different padding or endianness on different devices, but those are very real concerns. The skilled notes and warnings in these answers provide us with more ammunition to try to convince the client to change to a less fragile alternative, such as XML or JSON, for transferring data online across platforms.
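The layout described above could be parsed along these lines. This is only a sketch under stated assumptions: the Reader class and its method names are invented for illustration, the string length prefix is assumed to be one of the same two-byte little-endian integers, and any non-zero value is treated as "true" for the boolean.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical cursor over the raw file bytes.
class Reader {
public:
    explicit Reader(const std::vector<std::uint8_t>& bytes) : bytes_(bytes), pos_(0) {}

    // Two bytes, least significant first, assembled without relying on host endianness.
    std::uint16_t readU16() {
        if (bytes_.size() - pos_ < 2) throw std::runtime_error("truncated integer");
        const std::uint16_t value =
            static_cast<std::uint16_t>(bytes_[pos_] | (bytes_[pos_ + 1] << 8));
        pos_ += 2;
        return value;
    }

    // Same two bytes, reinterpreted as signed for the few signed fields.
    std::int16_t readI16() { return static_cast<std::int16_t>(readU16()); }

    // Length prefix (assumed two-byte), then that many characters; 0 means empty.
    std::string readString() {
        const std::size_t length = readU16();
        if (bytes_.size() - pos_ < length) throw std::runtime_error("truncated string");
        const auto first = bytes_.begin() + static_cast<std::ptrdiff_t>(pos_);
        std::string text(first, first + static_cast<std::ptrdiff_t>(length));
        pos_ += length;
        return text;
    }

    // Two bytes; FF FF was observed for true, so any non-zero value counts as true here.
    bool readBool() { return readU16() != 0; }

private:
    const std::vector<std::uint8_t>& bytes_;
    std::size_t pos_;
};
```

Reading the whole file into a byte vector first and decoding from that keeps the parser independent of the byte order and struct padding of the machine doing the reading.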
As for those of you asking if the developer was fired... well, let's just say their code is very old, but after multiple conversations we're still having trouble convincing them writing out the C++ struct and trying to read that on different platforms is not a good idea.
You're going to run into many problems.
C++ doesn't have a specific format for serializing data per se. It is highly dependent on the computer architecture/processor that you are running on.
The compiler is allowed to add padding to help alignment on systems. When we say alignment we basically are referring to an architecture/processor's affinity for having data lie on specific byte boundaries. For example, some processors vastly prefer floating point numbers to lie at 4 or 8 byte boundaries - if they don't the processor may work much slower or may not work at all.
So, you can't simply know what padding your system is adding magically.
What you can do is wrap the struct in #pragma pack(push, 1) / #pragma pack(pop) to stop your compiler from padding its members.
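A small sketch of what packing changes, assuming a compiler that supports the push/pop form of the pragma (MSVC, GCC and Clang do). The struct names are illustrative only.

```cpp
#include <cstdint>

// Default layout: the compiler may insert padding after `tag`
// so that `value` starts on its natural boundary.
struct Padded {
    char tag;
    std::int32_t value;
};

#pragma pack(push, 1)  // from here on, align members to 1 byte (no padding)
struct Packed {
    char tag;
    std::int32_t value;
};
#pragma pack(pop)      // restore the previous packing setting

static_assert(sizeof(Packed) == 5, "1 byte for tag + 4 bytes for value");
```

On most desktop compilers sizeof(Padded) comes out as 8, but that figure is exactly the kind of platform detail the file format ends up depending on.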
PS: you also have to worry about endianness. What if one computer is running on big-endian and one is little endian? They will interpret bytes differently without a conversion.
Simply put, you either have to fix the application generating the files so it uses a proper serialization scheme OR you need to look at it running on a SPECIFIC computer, look at exactly how it writes the files, and write a translator for every target platform (which is just silly).
Interesting Suggestion
If you're really stuck, write an app that monitors the folder where you write files. Have the app pick up the files (since it's on the same PC it'll be able to read their format without issue). Have it write the files back in XML or some other true serialization format and distribute those instead.
Whoa - that's crazy. So String objects don't contain any pointers? Must not- because you claim this is working code.
Anyway, that code isn't doing any serialization. It's just writing the structure out to file exactly the way it is laid out in memory. The only issue you have is that on some platforms the padding and the size of integral types like int may be different.
You'll have to find the size of the integral types, and use that information in reader/writer for newer platforms to make sure they get laid out the same way on the legacy platform.
You're running a real risk with that code though. As it is, a compiler change could suddenly cause the file layout to change.
The format of your data file is entirely down to the compiler that your C++ program is compiled with, and the definition of your String class. You can rely on the fields being in the order they're declared in, and in this case, I think you can rely on there not being any padding at the start, but that's about all. Some tips that might help you out in this case:-
You don't give the definition of the String class you're using. If it's a typedef for std::string, you're completely screwed, because the contents of the string aren't in the memory. I assume your C++ programmers are using some special local buffer, in which case I'll guess you will find the first bytes of the object are the string, and there is some amount of useless padding afterwards. I hope the struct contains an int at the start telling you how much data in it is useful.
You'll probably find the int fields are four bytes long.
You'll probably find the bool field is one byte long, followed by three bytes of useless padding. Only one bit, most likely the bottom bit, will be set.
That's about all the useful guesswork I can offer you. In your target language, make sure to read the whole file in as the closest thing to a byte array available in the language, and only after that, use the language features to convert it into the right kind of thing in your language. Don't try reading it in as integers, as that won't let you byte-swap if you're on a platform with different endianness to the C++ program. I suggest also looking through the file in a text editor to reverse-engineer it and help you find the offset of each field.
Last piece of advice: consider printing P45s (or pink slips, or whatever you have in your country) for whichever programmers or project managers thought this kind of 'serialization' was a good idea. This kind of sloppy work might have been acceptable in a life-or-death situation, but they have seriously screwed you over in a way you're going to find it very hard to recover from. Writing the code to read in these files will not be that hard, if it's only one struct like this, but keeping it reliable will be a world of pain, and they've effectively made it impossible for themselves to change compilers or compiler version safely.
The way it's done, the struct is written in raw form to a file. So basically what you need to know to parse this file is the binary layout of your struct.
Basically, the fields are just one after the other, so to read an int, you just read 4 bytes and cast that to an int, etc.
Strings are a particular case. It's not clear from your code whether this "String" type is an inline array of characters, or a pointer to such an array. In the first case, you need to know how many characters each string contains and simply read that number of characters sequentially. In the second case, you won't be able to get the string back, since it won't have been written to file. The pointer will be useless to you.
One last concern is whether the struct is packed or not. Since you gave no indication of that, assume the default: each field is aligned to its natural boundary (commonly 4 bytes for an int), so there may be padding, for instance after the boolean field, that you need to account for. If the struct is packed, then each field comes directly after the previous one.
So, to make a long story short, figure out your struct binary layout using its definition and, if all else fails, inspecting the memory at run-time with the debugger, or use a hex editor to study the output file. Then write that specification down somewhere and this will give you what you need to read from the file. It's impossible to tell exactly what that layout is simply by looking at the pseudo-definition you gave.
Writing to an ofstream does not serialize data. This code writes the raw memory content of the struct as if it were a string of char. Depending on your compiler, its version, its options and the system it is running on, the content will be completely different.
Even the number of bits in a char is allowed to differ between C++ implementations.
Data referenced by the object of the struct won't be written (forget the content of std::string).
If you cannot change the writer code, you must know the alignment policy, the size of the base types and the data representation. You will have to analyze the files produced by hand, for example with a hexadecimal editor like this one: http://www.physics.ohio-state.edu/~prewett/hexedit/, and probably look at your compiler documentation.
If you can change the writer code, use a proper serialization format like JSON, Protocol Buffers or simply XML.
No one has pointed out something that sticks out to me as particularly problematic (maybe because I've been bit by it). That problem: the data member bool Usadl;. sizeof(bool) varies across platforms, across compilers, and even across releases of the same compiler. Common values for sizeof(bool) are 4 and 1. This will bite you. It's getting hard to find a big endian machine nowadays, very, very hard to find a computer where CHAR_BIT is not 8 or sizeof(int) is not 4. This is not the case for sizeof(bool).
In agreement with everyone else, Chad's team needs to document the structure of the records in the file, and then make sure the program that produces the file writes this structure explicitly, including element sizes, padding, and endianness. Don't depend on class layout to do this for you. That's just asking for trouble.
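Writing the structure explicitly might look something like the sketch below, where every field is emitted with a fixed width and byte order. The function names are hypothetical, and the choice of little-endian 16-bit integers and a one-byte bool is just one possible convention.

```cpp
#include <cstdint>
#include <vector>

// Append a 16-bit value least significant byte first, regardless of host byte order.
void appendU16LE(std::vector<std::uint8_t>& out, std::uint16_t value) {
    out.push_back(static_cast<std::uint8_t>(value & 0xFF));
    out.push_back(static_cast<std::uint8_t>(value >> 8));
}

// Store a bool as exactly one byte (0 or 1), independent of sizeof(bool).
void appendBool(std::vector<std::uint8_t>& out, bool flag) {
    out.push_back(static_cast<std::uint8_t>(flag ? 1 : 0));
}
```

The resulting byte vector can then be written to the file in one call, and the format no longer changes when the compiler or its settings do.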
The best way would probably be to use JSON or if you want a more robust solution go with something like Avro. Avro has a C++ API and a Java API, so it covers most of the cases you're encountering.

Struggling with sprintf... something stupid?

Sorry to pester everyone, but this has been causing me some pain. Here's the code:
char buf[500];
sprintf(buf,"D:\\Important\\Calibration\\Results\\model_%i.xml",mEstimatingModelID);
mEstimatingModelID is an integer, currently holding value 0.
Simple enough, but debugging shows this is happening:
0x0795f630 "n\Results\model_0.xml"
I.e. it's missing the start of the string.
Any ideas? This is simple stuff, but I can't figure it out.
Thanks!
In an effort to make this an actual general answer: Here's a checklist for similar errors:
Never trust what you see in release mode, especially local variables that have been allocated from stack memory. Static variables, which live in static storage rather than on the stack, are about the only thing that will generally be correct, but even then, don't trust it. (Which was the case for the user above.)
It's been my experience that the more recent versions of VS have less reliable release mode data (probably b/c they optimize much more in release, or maybe it's 64bitness or whatever)
Always verify that you are examining the variable in the correct function. It is very easy to have a variable named "buf" in a higher function that has some uninitialized garbage in it. This would be easily confused with the same named variable in the lower subroutine/function.
It's always a good idea to double check for buffer overruns. If you ever use a %s in your sprintf, you could get a buffer overrun.
Check your types. sprintf is pretty adaptable and you can easily get a non-crashing but strange result by passing in a string pointer when an int is expected etc.

prepend and remove from a ( void * ) in C

I think this is a pretty straightforward problem, but I still cannot figure it out.
I have a function which sends a stream over the network. Naturally, this takes a const void * as its argument:
void network_send(const void* data, long data_length)
I am trying to prepend a specific header, in the form of a char*, to this before sending it out over the socket:
long sent_size = strlen(header)+data_length;
data_to_send = malloc(sent_size);
memcpy(data_to_send,header,strlen(header)); /*first copy the header*/
memcpy((char*)data_to_send+strlen(header),data,data_length); /*now copy the actual data*/
This works fine as long as the data is actually char*, but if it changes to some other data type, then this stops working.
When receiving, I need to remove the header from the data before processing it. This is how I do it:
void network_data_received(const void* data, long data_length)
{
........
memmove(data_from_network,(char*)data_from_network + strlen(header),data_length); /*move the data to the beginning of the array*/
ProcessFurther(data_from_network ,data_length - strlen(header)); /*data_length - strlen(header) causes the function ProcessFurther to read only certain part of the array*/
}
This again works OK if the data is of char type, but crashes if it is of any different type.
Can anyone suggest how to implement this properly?
Regards,
Khan
Sounds like alignment could be the issue, but you don't specify which platform you're doing this on (different CPU architectures have different alignment requirements).
If the header's length is "wrong" for the alignment of the following data, that could cause access violations.
Something surprises me in this code. Is your header actually a string? If it is a struct, or something similar, you should replace strlen with sizeof. Calling strlen on a non-zero-terminated string is likely to cause crashes.
The second thing that surprises me is that, when reading received data, you should copy the header somewhere. If you are not using it, why bother sending it over the wire?
EDIT: OK, the header is some http like header string. There should not be any problem from there, and it indeed does not need to be analysed if you're just testing.
And you should move the data to the place you actually need it, moving it to the beginning of the buffer does not look like the right thing to do.
If the problem comes from alignment, it will disappear if you copy the data to some variable of the real target type at byte level before using it.
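Copying at byte level could look like the following sketch. The function names are hypothetical, the payload is assumed to be a single 32-bit integer purely for illustration, and sender and receiver are assumed to share the same byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Build header + payload in one byte buffer; works for any trivially copyable payload.
std::vector<unsigned char> prependHeader(const char* header, const void* data,
                                         std::size_t dataLength) {
    const std::size_t headerLength = std::strlen(header);
    const unsigned char* h = reinterpret_cast<const unsigned char*>(header);
    const unsigned char* d = static_cast<const unsigned char*>(data);
    std::vector<unsigned char> packet;
    packet.reserve(headerLength + dataLength);
    packet.insert(packet.end(), h, h + headerLength);  // header bytes first
    packet.insert(packet.end(), d, d + dataLength);    // then the payload bytes
    return packet;
}

// Copy the payload bytes into a properly aligned object instead of casting the pointer.
bool extractInt32(const std::vector<unsigned char>& packet, std::size_t headerLength,
                  std::int32_t& value) {
    if (packet.size() < headerLength || packet.size() - headerLength < sizeof value) {
        return false;  // not enough bytes after the header
    }
    std::memcpy(&value, packet.data() + headerLength, sizeof value);
    return true;
}
```

With a three-byte header the payload starts at an odd offset, which is exactly the case where dereferencing a cast pointer can fault on some CPUs while the memcpy version remains safe.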
There is another solution: allocate your buffer with malloc and put the data structure you want at the beginning. Then you should be able to cast it. Addresses returned by malloc are compatible with any type.
Also be aware that if you were working with C++, casting to a non-trivial class is unlikely to work (for one thing, vtables are likely to get wrong addresses, and there are other issues).
Another possible source of problem is the way you get data_length. It should be a number of bytes. Are you sure it is not a number of items ? To be sure we need some hint of the calling code.
memcpy's behaviour is undefined if the source and target overlap (as in this instance); you should be using memmove().
What exactly is happening when what is not char*? These functions will generally cast to void* before actually doing any work...
It's possible that data_length is not calculated correctly in the calling code. Otherwise this code seems to be fine apart from possible alignment issues mentioned by @unwind.
How is header declared? Does it have variable length? Are you missing a terminating NUL character after the header?
I'd also check to make sure that both sender and receiver use the same byte ordering architecture (little endian vs. big endian).
Using unsigned char * solved the issue. Thank you all for your comments.

C++: Determining whether a variable contains no data

I've been messing around in C++ a little bit but I'm still pretty new. I searched around a little bit and even using the keywords of exactly the problem I am trying to tackle yields no results. Basically I am just trying to figure out how to tell if a variable has no data. I have a file that my program reads and it searches for a specific character within that file and basically uses delimiters to determine where to store the actual data in a variable. Now I added some comments in the file saying that it should not be edited which has caused me some problems. So I pretty much want to count the number of comments, but I'm not sure how to do it because the way I had it set up was resulting in huge numbers being returned. So I figured I would attempt to fix it with a simple if statement to see if there was any data in the array while it was running the loop, and if there was then simply add +1 to my variable. Needless to say it did not work. Here's the code. And if you know a better way of doing this, by all means please do share.
size_t arySearchData[20];
size_t commentLines[20];
size_t foundDelimiter;
size_t foundComment;
int commentsNum;
foundDelimiter = lineText.find("]");
foundComment = lineText.find("#");
if (foundComment != std::string::npos) {
commentLines[20] = int(foundComment);
if (foundComment = <PROBLEM>){
commentsNum++;
}
}
So it successfully gets the two comments in my file and recognizes that they are located at the first index(0) in each line but when I tried to have it just do commentsNum++ in my first if statement it just comes up with tons of random numbers, and I am not sure why. So as I said my problem is within the second if statement, I need a void or just a better way to solve this. Any help would be greatly appreciated.
And yes I do realize I could just determine if there 'was' data in the there rather than being void or null but then it would have to be specific and if the comment (#) had a space before it, then it would render my method of reading the file useless as the index will have changed.
A variable in C++ always contains data, just it may not be initialised.
int i;
It will have some value, what it is can't be determined until you do something like
i = 1337;
until you do that the value of i will be whatever happened to be in the memory location that i has been assigned to.
The compiler may pick up on the fact that you are trying to use a variable which you have not actually given a value yourself, but this will normally just be a warning rather than an error, even though reading an uninitialised variable is undefined behaviour.
You do not initialize commentsNum. Try this:
int commentsNum = 0;
In C++, variables other than static ones start out with indeterminate values. This follows the underlying philosophy, "you don't pay for things you don't use", so the memory is not zeroed by default. Static variables, however, are zero-initialized: their memory is allocated at link time, and unlike the runtime initialization that local variables would need, link-time allocation and initialization incur low cost.
Hence I would recommend setting int commentsNum = 0;
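Putting that together, counting the comment lines might be sketched as follows. The function name countCommentLines is hypothetical, and a "comment" is assumed to be any line whose first non-blank character is '#', which also covers the case of a space before the '#'.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Counts lines whose first non-blank character is '#'.
int countCommentLines(const std::vector<std::string>& lines) {
    int commentsNum = 0;  // initialised, so the count starts from a known value
    for (const std::string& lineText : lines) {
        const std::size_t first = lineText.find_first_not_of(" \t");
        if (first != std::string::npos && lineText[first] == '#') {
            ++commentsNum;
        }
    }
    return commentsNum;
}
```

No "empty" sentinel is needed: std::string::npos already signals that nothing was found, and the counter is only touched when a comment is actually present.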