C++: Determining whether a variable contains no data - c++

I've been messing around in C++ a little bit but I'm still pretty new. I searched around a little bit and even using the keywords of exactly the problem I am trying to tackle yields no results. Basically I am just trying to figure out how to tell if a variable has no data. I have a file that my program reads and it searches for a specific character within that file and basically uses delimiters to determine where to store the actual data in a variable. Now I added some comments in the file saying that it should not be edited which has caused me some problems. So I pretty much want to count the number of comments, but I'm not sure how to do it because the way I had it set up was resulting in huge numbers being returned. So I figured I would attempt to fix it with a simple if statement to see if there was any data in the array while it was running the loop, and if there was then simply add +1 to my variable. Needless to say it did not work. Here's the code. And if you know a better way of doing this, by all means please do share.
size_t arySearchData[20];
size_t commentLines[20];
size_t foundDelimiter;
size_t foundComment;
int commentsNum;
foundDelimiter = lineText.find("]");
foundComment = lineText.find("#");
if (foundComment != std::string::npos) {
commentLines[20] = int(foundComment);
if (foundComment = <PROBLEM>){
commentsNum++;
}
}
So it successfully gets the two comments in my file and recognizes that they are located at the first index(0) in each line but when I tried to have it just do commentsNum++ in my first if statement it just comes up with tons of random numbers, and I am not sure why. So as I said my problem is within the second if statement, I need a void or just a better way to solve this. Any help would be greatly appreciated.
And yes, I do realize I could just determine whether there 'was' data in there rather than it being void or null, but then it would have to be specific, and if the comment (#) had a space before it, it would render my method of reading the file useless because the index would have changed.

A variable in C++ always contains data, just it may not be initialised.
int i;
It will have some value, what it is can't be determined until you do something like
i = 1337;
until you do that the value of i will be whatever happened to be in the memory location that i has been assigned to.
The compiler may pick up on the fact that you are trying to use a variable which you have not actually given a value yourself, but this will normally just be a warning rather than an error, even though reading an uninitialised variable is undefined behaviour.

You do not initialize commentsNum. Try this:
int commentsNum = 0;

In C++, variables other than static ones are given indeterminate values unless you initialise them. This follows the underlying philosophy of "you don't pay for things you don't use", so that memory is not zeroed by default. Static variables, however, have their memory allocated at link time. Unlike the run-time initialization that local variables would need, link-time allocation and initialization incur little cost.
I would recommend hence setting int commentsNum = 0;
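To tie the answers above back to the original goal, here is a minimal sketch of counting comment lines with a counter that starts at 0. The function names isCommentLine and countCommentLines are made up for illustration, and it assumes a comment is any line whose first non-blank character is '#', which also covers the "space before the #" case raised in the question.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true when the first non-blank character of the line is '#'.
// (Hypothetical helper; the definition of "comment" is an assumption.)
bool isCommentLine(const std::string& lineText) {
    const std::size_t firstChar = lineText.find_first_not_of(" \t");
    return firstChar != std::string::npos && lineText[firstChar] == '#';
}

// Counts comment lines. The counter is initialised to 0, so it never
// starts from whatever happened to be in memory.
int countCommentLines(const std::vector<std::string>& lines) {
    int commentsNum = 0;
    for (const std::string& lineText : lines) {
        if (isCommentLine(lineText)) {
            ++commentsNum;
        }
    }
    return commentsNum;
}
```

Because the check looks for the first non-blank character rather than a fixed index, there is no need to store the position of each '#' in an array at all.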

Related

Resizing struct / char array (to reduce memory usage)

This is my first project on Arduino/C++/ESP32. I wrote a fairly big program and got almost everything working - except that in the end I realized that the device would run out of breath (memory) periodically and go for a reboot. The reboot is because I configured a watchdog to do so.
There is one area where I think there's a chance to reduce the memory usage but my experience on c++ is "not there yet" for me to be able to write this by myself. Any pointers (no pun intended) please? I have been on this since yesterday and getting rid of one error only results in another new error popping up. Moreover I don't want to come up with something that is hacky or might break later. It should be a quick answer for the experienced people here.
Let me explain the code that I prefer to refactor/optimize.
I need to store a bunch of records that I would need to read/manipulate later. I declared a struct (because they are related fields) globally. Now the issue is that I may need to store 1 record, 2 records or 5 records which I would only know later once I read the data from the EEPROM. And this has to be accessible to all the functions so it has to be a global declaration.
To summarize
Question 1 - how to set "NumOfrecs" later in the program once the data is read from the eeprom.
Question 2 - The size(sizeOfUsername) of the char array username can also change depending upon the length of the username read from the eeprom. At times it might be 5 characters long, at times it could be 25. I can set it to a max 25 and solve this problem but then wouldn't I be wasting memory if many usernames were just 4-5 characters long? So in short - just before copying over the data in eeprom into the "username" char array, is it possible to set it's size to the optimal size required for holding that data ( which is the data size + 1 byte for null termination ).
struct stUSRREC {
char username[sizeOfUsername];
bool online;
};
stUSRREC userRecords[NumOfrecs];
I familiarized myself with a whole bunch of functions like strcpy, memset, malloc etc but now I have run out of time and need to keep the learning part for another day.
I can try to do this in a slightly different manner where I don't use the struct and instead use individual char arrays ( for each field like username ). But then again I'll have to resize the arrays as I read the data from the eeprom.
I can explain all the things I have tried but that will make this question unnecessarily long and perhaps result in losing some clarity. Greatly appreciate any help.
While responding to Q&A on SO I was trying some random stuff and at least this little piece of code below seems to work ( in terms of storing smaller/bigger values )
struct stUSRREC {
char username[];
bool online;
};
stUSRREC userRecords[5];
Then manipulate it this way
strcpy(userRecords[0].username, "MYUSERNAME");
strcpy(userRecords[0].username, "test");
strcpy(userRecords[0].username, "MYVERYBIGUSERNAME");
I have been able to write/rewrite different lengths (above) and can read all of them back correctly. Resizing "userRecords" might be a different game but that can wait a little
One thing I forgot to mention was that I will need to size/resize the array ( holding username ) ONLY ONCE. In the setup() itself I can read/load the required data into those arrays. I am not sure if that opens up any other possibility. The rest of the struct/array I need to manipulate during the running are only boolean and int values. This is not an issue at all because there is no resizing required to do so.
On a side note I am pretty sure I am not the only one who faced this situation. Any tips/clues/pointers could be of help to many others. The constraints on little devices like ESP32 become more visible when you really start loading them with a bunch of things. I had it all working with "Strings" (the capital S) but the periodic reboot (cpu starvation?) required me to get rid of the Strings. Even otherwise I hear that using Strings (on ESP, Arduino and gang) is a bad idea.
You tagged this question as C++, so I'll ask:
Can you use vector and string in your embedded code?
#include <string>
#include <vector>
struct stUSRREC {
std::string username;
bool online;
stUSRREC(const char* name, bool isOnline) :
username(name),
online(isOnline)
{
}
};
std::vector<stUSRREC> userRecords;
The use of string as the username type means you only allocate as many characters as are needed to hold the name instead of allocating an assumed max size of sizeOfUsername. The use of vector allows you to dynamically grow your record set.
Then to add a new record:
stUSRREC record("bob", true);
userRecords.push_back(record);
And you may not need NumOfrecs anymore. That's covered by userRecords.size().
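Since the question says the sizing only has to happen once, a sketch of building the whole record set in one go might look like the following. buildRecords is a hypothetical helper, the names are assumed to have been read from the EEPROM already, and whether std::string and std::vector fit your memory budget on the ESP32 is something you would have to measure.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct stUSRREC {
    std::string username;
    bool online;
};

// Hypothetical helper: builds the record set once, for example from
// setup(), after the names have been read from storage.
std::vector<stUSRREC> buildRecords(const std::vector<std::string>& names) {
    std::vector<stUSRREC> records;
    records.reserve(names.size());  // size the record array once, up front
    for (const std::string& name : names) {
        records.push_back(stUSRREC{name, false});  // everyone starts offline
    }
    return records;
}
```

After that, flipping the online flag at run time does not resize anything, and records.size() takes the place of NumOfrecs.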

Using a for loop with changing commands as a variable

I've been looking around the web and this site for an answer to this scenario but everything I've come across is about reading it from an outside file or changing what the command is in the code but not changing what it does. I'm just messing around with code to refresh myself before I do anything practical. I'm verifying that certain constants are equal to a number that I have specified. (I've never posted here and I've been doing this all day so I'm not taking the time to learn the code insert tags.)
string one = "CHAR_MAX"; // <<< I know this doesn't work. It's what I am
// trying to do in the loop.
if (one == 127)
cout << "Max char count: " << CHAR_MAX << ">>> Pass >>> " one;
I know there are other only slightly more tedious ways to accomplish this. But I'm fairly sure there is a way to do this without an external .txt file to be read from and I've spent far too long trying to figure it out. It's driving me crazy and it's been almost 3 hours since I got to this.
Edit:
I'll look more into the 'constexpr' but from what I'm seeing I think it may. There are numerous other ways for me to complete this, yes. I just want to understand a way in that backwards manner. For comprehension. As far as intentions go (unless you mean something other with 'intend'), I'm looking at different ways to accomplish a silly program that has several variable limit constants. Like min long, max long, short, max int, etc. And I was thinking of ways to compare them with what the number is. Not for any reason to use. It'd be completely useless because they are predefined. I just thought of assigning the commands (forget what they are referred to in source code) such as CHAR_MAX to a variable that changes along with a for loop after outputting the results. I would have to define them in a list prior but I just couldn't figure out how. (Also: Thanks to the mod who changed my code block to read correctly.)
2nd Edit: Take all variable limits. 18446744073709551615 for unsigned long long. 4294967295 for long (not sure why it's this way, int is the same). Get those numbers with associated commands but by means of a for loop where the command is equal to a variable. (AKA 1 or even "one") It doesn't matter the name as long as it can contain the same command. I have a feeling "variable" is an incorrect way to phrase this but you'd be storing it with a changeable memory assigned call that you can use in a for loop by incrementing a counter for that loop and outputting an
if(*command-as-variable* == *what that number is as a corresponding number*)
cout << "Pass";
else cout << "Fail";
Depending on the circumstances I associate the terms with it COULD fail but if everything works correctly it should not. Like I said, Meaningless. Just a different way I could more efficiently write this code instead of having 19 different cout statements. It's how to execute the idea I am trying to find out.
I still haven't looked at what constexpr is for but I am about to pass out now. It's been a long day. I'll edit this tomorrow after I look. Or if it's explained that'd be even better! :)
You are assigning a text string to a string variable, then comparing a string variable to an integer. Your compiler should reject that comparison, or at least warn about it.
Maybe you want this:
constexpr int one = CHAR_MAX;
if (one == 127)
//...
The CHAR_MAX is a predefined constant (identifier/macro).
You could also do this:
if (CHAR_MAX == 127)
{
//...
}
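For the loop described in the edits to the question, note that a macro name such as CHAR_MAX cannot be stored in a string and evaluated at run time. What you can do is store the value next to a label in a table and loop over that. The following is only a sketch: LimitCheck, countPasses and exampleChecks are invented names, and the expected values are assumptions about a typical platform.

```cpp
#include <climits>
#include <cstddef>
#include <vector>

// One row per limit: a label, the value the implementation reports,
// and the value we expect to see.
struct LimitCheck {
    const char* name;
    long long actual;
    long long expected;
};

// Counts the rows whose actual value matches the expected one.
std::size_t countPasses(const std::vector<LimitCheck>& checks) {
    std::size_t passes = 0;
    for (const LimitCheck& check : checks) {
        if (check.actual == check.expected) {
            ++passes;
        }
    }
    return passes;
}

// Example table. The expected values are assumptions about a typical
// platform and may legitimately differ elsewhere.
std::vector<LimitCheck> exampleChecks() {
    return {
        {"SCHAR_MAX", SCHAR_MAX, 127},
        {"UCHAR_MAX", UCHAR_MAX, 255},
        {"SHRT_MAX", SHRT_MAX, 32767},
    };
}
```

Inside the loop you could just as well print check.name followed by "Pass" or "Fail", which replaces the long run of separate cout statements with one. Unsigned limits that do not fit in long long, such as ULLONG_MAX, would need a wider or unsigned field.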

Use of Literals, yay/nay in C++

I've recently heard that in some cases, programmers believe that you should never use literals in your code. I understand that in some cases, assigning a variable name to a given number can be helpful (especially in terms of maintenance if that number is used elsewhere). However, consider the following case studies:
Case Study 1: Use of Literals for "special" byte codes.
Say you have an if statement that checks for a specific value stored in (for the sake of argument) a uint16_t. Here are the two code samples:
Version 1:
// Descriptive comment as to why I'm using 0xBEEF goes here
if (my_var == 0xBEEF) {
//do something
}
Version 2:
const uint16_t kSuperDescriptiveVarName = 0xBEEF;
if (my_var == kSuperDescriptiveVarName) {
// do something
}
Which is the "preferred" method in terms of good coding practice? I can fully understand why you would prefer version 2 if kSuperDescriptiveVarName is used more than once. Also, does the compiler do any optimizations to make both versions effectively the same executable code? That is, are there any performance implications here?
Case Study 2: Use of sizeof
I fully understand that using sizeof versus a raw literal is preferred for portability and also readability concerns. Take the two code examples into account. The scenario is that you are computing the offset into a packet buffer (an array of uint8_t) where the first part of the packet is stored as my_packet_header, which let's say is a uint32_t.
Version 1:
const int offset = sizeof(my_packet_header);
Version 2:
const int offset = 4; // good comment telling reader where 4 came from
Clearly, version 1 is preferred, but what about for cases where you have multiple data fields to skip over? What if you have the following instead:
Version 1:
const int offset = sizeof(my_packet_header) + sizeof(data_field1) + sizeof(data_field2) + ... + sizeof(data_fieldn);
Version 2:
const int offset = 47;
Which is preferred in this case? Does it still make sense to show all the steps involved with computing the offset or does the literal usage make sense here?
Thanks for the help in advance as I attempt to better my code practices.
Which is the "preferred" method in terms of good coding practice? I can fully understand why you would prefer version 2 if kSuperDescriptiveVarName is used more than once.
Sounds like you understand the main point... factoring values (and their comments) that are used in multiple places. Further, it can sometimes help to have a group of constants in one place - so their values can be inspected, verified, modified etc. without concern for where they're used in the code. Other times, there are many constants used in proximity and the comments needed to properly explain them would obfuscate the code in which they're used.
Countering that, having a const variable means all the programmers studying the code will be wondering whether it's used anywhere else, keeping it in mind as they inspect the rest of the scope in which it's declared etc. - the fewer unnecessary things to remember, the surer the understanding of the important parts of the code will be.
Like so many things in programming, it's "an art" balancing the pros and cons of each approach, and best guided by experience and knowledge of the way the code's likely to be studied, maintained, and evolved.
Also, does the compiler do any optimizations to make both versions effectively the same executable code? That is, are there any performance implications here?
There are no performance implications in optimised code.
I fully understand that using sizeof versus a raw literal is preferred for portability and also readability concerns.
And other reasons too. A big factor in good programming is reducing the points of maintenance when changes are done. If you can modify the type of a variable and know that all the places using that variable will adjust accordingly, that's great - saves time and potential errors. Using sizeof helps with that.
Which is preferred [for calculating offsets in a struct]? Does it still make sense to show all the steps involved with computing the offset or does the literal usage make sense here?
The offsetof macro (#include <cstddef>) is better for this... again reducing maintenance burden. With the this + that approach you illustrate, if the compiler decides to use any padding your offset will be wrong, and further you have to fix it every time you add or remove a field.
Ignoring the offsetof issues and just considering your this + that example as an illustration of a more complex value to assign, again it's a balancing act. You'd definitely want some explanation/comment/documentation re intent here (are you working out the binary size of earlier fields? calculating the offset of the next field?, deliberately missing some fields that might not be needed for the intended use or was that accidental?...). Still, a named constant might be enough documentation, so it's likely unimportant which way you lean....
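To make the padding point concrete, here is a small sketch comparing offsetof with a hand-summed offset. The Packet struct and its field names are hypothetical, and it only applies if the fields actually live in a struct rather than being packed by hand into a uint8_t buffer.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical packet layout, for illustration only.
struct Packet {
    std::uint8_t  flags;
    std::uint32_t header;
    std::uint16_t length;
};

// Offset the compiler actually uses, padding included.
constexpr std::size_t kLengthOffset = offsetof(Packet, length);

// Hand-summed sizes of the preceding fields; this ignores any padding.
constexpr std::size_t kSummedSizes =
    sizeof(std::uint8_t) + sizeof(std::uint32_t);

// Padding can only push a field further out, never pull it in.
static_assert(kLengthOffset >= kSummedSizes,
              "offset is at least the summed field sizes");
```

On a typical platform where uint32_t is 4-byte aligned, the compiler inserts padding after flags, so the two constants differ. That is exactly the case where the this + that arithmetic silently gives the wrong answer.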
In every example you list, I would go with the name.
In your first example, you almost certainly used that special 0xBEEF number at least twice - once to write it and once to do your comparison. If you didn't write it, that number is still part of a contract with someone else (perhaps a file format definition).
In the last example, it is especially useful to show the computation that yielded the value. That way, if you encounter trouble down the line, you can easily see either that the number is trustworthy, or what you missed and fix it.
There are some cases where I prefer literals over named constants though. These are always cases where a name is no more meaningful than the number. For example, you have a game program that plays a dice game (perhaps Yahtzee), where there are specific rules for specific die rolls. You could define constants for One = 1, Two = 2, etc. But why bother?
Generally it is better to use a name instead of a value. After all, if you need to change it later, you can find it more easily. Also it is not always clear why this particular number is used, when you read the code, so having a meaningful name assigned to it, makes this immediately clear to a programmer.
Performance-wise there is no difference, because the optimizers should take care of it. And it is rather unlikely, even if there would be an extra instruction generated, that this would cause you troubles. If your code would be that tight, you probably shouldn't rely on an optimizer effect anyway.
I can fully understand why you would prefer version 2 if kSuperDescriptiveVarName is used more than once.
I think kSuperDescriptiveVarName will definitely be used more than once: once for the check and at least once for an assignment, maybe in a different part of your program.
There will be no difference in performance, since an optimization called Constant Propagation exists in almost all compilers. Just enable optimization for your compiler.

Struggling with sprintf... something stupid?

Sorry to pester everyone, but this has been causing me some pain. Here's the code:
char buf[500];
sprintf(buf,"D:\\Important\\Calibration\\Results\\model_%i.xml",mEstimatingModelID);
mEstimatingModelID is an integer, currently holding value 0.
Simple enough, but debugging shows this is happening:
0x0795f630 "n\Results\model_0.xml"
I.e. it's missing the start of the string.
Any ideas? This is simple stuff, but I can't figure it out.
Thanks!
In an effort to make this an actual general answer: Here's a checklist for similar errors:
Never trust what you see in release mode, especially local variables that have been allocated from stack memory. Static variables, which live in static storage rather than on the stack, are about the only thing that will generally be correct, but even then, don't trust it. (Which was the case for the user above)
It's been my experience that the more recent versions of VS have less reliable release mode data (probably b/c they optimize much more in release, or maybe it's 64bitness or whatever)
Always verify that you are examining the variable in the correct function. It is very easy to have a variable named "buf" in a higher function that has some uninitialized garbage in it. This would be easily confused with the same named variable in the lower subroutine/function.
It's always a good idea to double check for buffer overruns. If you ever use a %s in your sprintf, you could get a buffer overrun.
Check your types. sprintf is pretty adaptable and you can easily get a non-crashing but strange result by passing in a string pointer when an int is expected etc.
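For the buffer overrun item in the checklist, one option is std::snprintf, which takes the buffer size and reports how many characters the full result needs. This is just a sketch: makeModelPath is a made-up helper, and returning an empty string on failure is an arbitrary choice for the example.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats the model path into a fixed buffer and
// returns an empty string if the result would not fit.
std::string makeModelPath(int modelId) {
    char buf[500];
    const int written = std::snprintf(
        buf, sizeof(buf),
        "D:\\Important\\Calibration\\Results\\model_%d.xml", modelId);
    if (written < 0 || static_cast<std::size_t>(written) >= sizeof(buf)) {
        return std::string();  // encoding error or truncated output
    }
    return std::string(buf);
}
```

The return value check is the important part: snprintf never writes past the given size, but it will silently truncate unless you compare the result against the buffer size.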

Magic Numbers In Arrays? - C++

I'm a fairly new programmer, and I apologize if this information is easily available out there, I just haven't been able to find it yet.
Here's my question:
Is it considered a magic number when you use a literal number to access a specific element of an array?
For example:
arrayOfNumbers[6] // Is six a magic number in this case?
I ask this question because one of my professors is adamant that all literal numbers in a program are magic numbers. It would be nice for me just to access an element of an array using a real number, instead of using a named constant for each element.
Thanks!
That really depends on the context. If you have code like this:
arr[0] = "Long";
arr[1] = "sentence";
arr[2] = "as";
arr[3] = "array.";
...then 0..3 are not considered magic numbers. However, if you have:
int doStuff()
{
return my_global_array[6];
}
...then 6 is definitively a magic number.
It's pretty magic.
I mean, why are you accessing the 6th element? What are the semantics that should be applied to that number? As it stands all we know is "the 6th (zero-based) number". If we knew the declaration of arrayOfNumbers we would further know its type (e.g. an int or a double).
But if you said:
arrayOfNumbers[kDistanceToSaturn];
...now it has much more meaning to someone reading the code.
In general one iterates over an array, performing some operation on each element, because one doesn't know how long the array is and you can't just access it in a hardcoded manner.
However, sometimes array elements have specific meanings, for example, in graphics programming. Sometimes an array is always the same size because the data demands it (e.g. certain transform matrices). In these cases it may or may not be okay to access the specific element by number: domain experts will know what you're doing, but generalists probably won't. Giving the magic index number a name makes it more obvious to those who have to maintain your code, and helps you to prevent typing the wrong one accidentally.
In my example above I assumed your array holds distances from the sun to a planet. The sun would be the zeroth element, thus arrayOfNumbers[kDistanceToSun] = 0. Then as you increment, each element contains the distance to the next farthest planet: mercury, venus, etc. This is much more readable than just typing the number of the planet you want. In this case the array is of a fixed size because there are a fixed number of planets (well, except the whole Pluto debacle).
The other problem is that "arrayOfNumbers" tells us nothing about the contents of the array. We already know it's an array of numbers because we saw the declaration somewhere where you said int arrayOfNumbers[12345]; or however you declared it. Instead, something like:
int distanceToPlanetsFromSol[kNumberOfPlanets];
...gives us a much better idea of what the data actually is and what its semantics are. One of your goals as a programmer should be to write code that is self-documenting in this manner.
And then we can argue elsewhere if kNumberOfPlanets should be 8 or 9. :)
You should ask yourself why you are accessing that particular position. In this case, I assume that if you are doing arrayOfNumbers[6], the sixth position has some special meaning. If you think about what that meaning is, you will probably realize that the magic number is hiding it.
another way to look at it:
What if after some change the program needs to access the 7th element instead of the 6th? How would you or a maintainer know that? If, for example, the 6th entry is the count of trees in CA, it would be a good thing to put
#define CA_STATE_ENTRY 6
Then if the table is now reordered, somebody can see that they need to change this to 9 (say). BTW I am not saying this is the best way to maintain an array for tree counts by state - it probably isn't.
Likewise, if later people want to change the program to deal with trees in oregon, then they know to replace
trees[CA_STATE_ENTRY]
with
trees[OR_STATE_ENTRY]
The point is
trees[6]
is not self-documenting
Of course for c++ it should be an enum not a #define
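A rough sketch of the enum version could look like this. The enumerator names, the index values and the treesIn helper are all made up for illustration; the only point is that the index gets a name and the table size is tied to the same enum.

```cpp
#include <array>
#include <cstddef>

// Hypothetical indices into a per-state table; the values are illustrative.
enum StateEntry : std::size_t {
    kOregonEntry = 5,
    kCaliforniaEntry = 6,
    kStateEntryCount = 50
};

using TreeCounts = std::array<long, kStateEntryCount>;

// Reads one entry by name instead of by bare number.
long treesIn(const TreeCounts& trees, StateEntry state) {
    return trees[state];
}
```

If the table is reordered later, only the enumerator values change, and every treesIn(trees, kCaliforniaEntry) call keeps reading the right slot.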
You'd have to provide more context for a meaningful answer. Not all literal numbers are magic, but many are. In a case like that there is no way at all to tell for sure, though most cases I can think of off-hand with an explicit array index >>1 probably qualify as magic.
Not all literals in a program really qualify as "magic numbers" -- but this one certainly seems to. The 6 gives us no clue of why you're accessing that particular element of the array.
To not be a magic number, you need its meaning to be quite clear even on first examination (or at least minimal examination) why that value is being used. Just for example, a lot of code will do things like: &x[0]. In this case, it's typically pretty clear that the '0' really just means "the beginning of the array."
If you need to access a particular element of the array, chances are you're doing it wrong.
You should almost always be iterating over the entire array.
It's only not a magic number if your program is doing something very special involving the number six specifically. Could you provide some context?
That's the problem with professors, they're often too academic. In theory he's right, as usual, but usually magic numbers are used in a stricter context, when the number is embedded in a data stream, allowing you to detect certain properties of the stream (like the signature header of a file type for instance).
See also this Wikipedia entry.
Usually not all constant values in software are called magic numbers.
A Java class file always starts with the hex value 0xcafebabe, and a Windows .exe file starts with MZ (0x4d, 0x5a). This allows you to quickly (but not with certainty) identify the content of a binary file.
In a MISRA compliant system, all values except 0 and 1 are considered magic numbers. My opinion has always been if the constant value is obvious or likely won't change then leave it as a number. If in doubt create a unique constant since long term maintenance will be easier.