C++ Split definition and assignment of variables - c++

So I recently installed CLion and after some setup I started messing with some older code I'd written. Now a great thing about CLion is that it helps you with coding style etc. but one thing I found strange is that it recommended to me to split the definition and assignment of variables. I remember a specific case where I defined strings as follows:
string path = "mypath";
But the IDE recommended to write it like:
string path;
path = "mypath";
Now I started looking for this online to find pros and cons of both methods. Apparently the former is faster but the latter is more secure because it calls the copy constructor or something (I didn't quite understand that part). My question is basically: Since CLion recommends doing it the latter way, does that mean that it is always better? Is there one way that is preferred over the other or is it situational? Do they each have their pros/cons and if so, what are they?
Any help or reliable resources are greatly appreciated.
Thanks in advance,
Xentro

Your IDE needs to be consigned to the bin.
string path = "mypath"; is an infinitely better pattern. This is because, in general, there is no potential hazard of an indeterminate value of path. (Granted, in this case, that is not an issue since a std::string has a well-defined default constructor but the preferred way is a good habit to get into).
The compiler is also spared the headache of constructing then assigning a value to an object.

But the IDE recommended to write it like:
string path;
path = "mypath";
Time to uninstall the IDE. The recommendation is utter nonsense.
Perhaps the worst thing about it is how it prevents const correctness. If path never changes, then you should make it const, and you can only do that if you initialise the variable with the correct value right away:
std::string const path = "mypath";
Note that depending on the context of the code, you may also want to use auto (which will turn the variable into a char pointer, because the deduced type of "mypath" is a char array, but perhaps it turns out you don't even need a full-fletched std::string?) - and that also only works if you do not follow the stupid IDE recommendation:
auto const path = "mypath";
Apparently the former is faster
Nonsense. This is not about speed but about correctness and code maintainability.
but the latter is more secure because it calls the copy constructor or something (I didn't quite understand that part).
That's nonsense, too. It calls one of std::string's overloaded assignment operators, not the copy constructor. And it's not more "secure" in any usual sense of the word.

The advice is nonsense and other answers here tell that already.
I want to add a source for this: Straight from the creator of C++ Bjarne Stroustrup and the chairman on C++ comete (hope I don't get his postion wrong) Herb Sutter:
C++ core guidelines:
ES.20: Always initialize an object
Reason Avoid used-before-set errors
and their associated undefined behavior. Avoid problems with
comprehension of complex initialization. Simplify refactoring. Example
void use(int arg)
{
int i; // bad: uninitialized variable
// ...
i = 7; // initialize i
}
No, i = 7 does not initialize i; it assigns to it. Also, i can be read
in the ... part. Better:
void use(int arg) // OK
{
int i = 7; // OK: initialized
string s; // OK: default initialized
// ...
}
Note The always initialize rule is deliberately stronger than the an object must be set before used language rule. The latter, more
relaxed rule, catches the technical bugs, but:
It leads to less readable code
It encourages people to declare names in greater than necessary scopes
It leads to harder to read code
It leads to logic bugs by encouraging complex code
It hampers refactoring
The always initialize rule is a style rule aimed to improve
maintainability as well as a rule protecting against used-before-set
errors.

string path = "mypath";
The code above is called using a copy constructor
string path;
path = "mypath";
And the other one is called uses the assignment operator.
Theoretically, the pros and cons really depend on how the compiler going to deal with the code. So it vary from compiler to compiler. In general, the first way is trying to build a object by copying another existing object. While the second one is first create a new empty object, then using the assignment operator ("=", the equal mark) to give value to the variable.
You can see more discussions in the following links:
Why separate variable definition and initialization in C++?
and What's the difference between assignment operator and copy constructor?

Related

How to force a compile error in C++(17) if a function return value isn't checked? Ideally through the type system

We are writing safety-critical code and I'd like a stronger way than [[nodiscard]] to ensure that checking of function return values is caught by the compiler.
[Update]
Thanks for all the discussion in the comments. Let me clarify that this question may seem contrived, or not "typical use case", or not how someone else would do it. Please take this as an academic exercise if that makes it easier to ignore "well why don't you just do it this way?". The question is exactly whether it's possible to create a type(s) that fails compiling if it is not assigned to an l-value as the return result of a function call .
I know about [[nodiscard]], warnings-as-errors, and exceptions, and this question asks if it's possible to achieve something similar, that is a compile time error, not something caught at run-time. I'm beginning to suspect it's not possible, and so any explanation why is very much appreciated.
Constraints:
MSVC++ 2019
Something that doesn't rely on warnings
Warnings-as-Errors also doesn't work
It's not feasible to constantly run static analysis
Macros are OK
Not a runtime check, but caught by the compiler
Not exception-based
I've been trying to think how to create a type(s) that, if it's not assigned to a variable from a function return, the compiler flags an error.
Example:
struct MustCheck
{
bool success;
...???...
};
MustCheck DoSomething( args )
{
...
return MustCheck{true};
}
int main(void) {
MustCheck res = DoSomething(blah);
if( !res.success ) { exit(-1); }
DoSomething( bloop ); // <------- compiler error
}
If such a thing is provably impossible through the type system, I'll also accept that answer ;)
(EDIT) Note 1: I have been thinking about your problem and reached the conclusion that the question is ill posed. It is not clear what you are looking for because of a small detail: what counts as checking? How the checkings compose and how far from the point of calling?
For example, does this count as checking? note that composition of boolean values (results) and/or other runtime variable matters.
bool b = true; // for example
auto res1 = DoSomething1(blah);
auto res2 = DoSomething2(blah);
if((res1 and res2) or b){...handle error...};
The composition with other runtime variables makes it impossible to make any guarantee at compile-time and for composition with other "results" you will have to exclude certain logical operators, like OR or XOR.
(EDIT) Note 2: I should have asked before but 1) if the handling is supposed to always abort: why not abort from the DoSomething function directly? 2) if handling does a specific action on failure, then pass it as a lambda to DoSomething (after all you are controlling what it returns, and what it takese). 3) composition of failures or propagation is the only not trivial case, and it is not well defined in your question.
Below is the original answer.
This doesn't fulfill all the (edited) requirements you have (I think they are excessive) but I think this is the only path forward really.
Below my comments.
As you hinted, for doing this at runtime there are recipes online about "exploding" types (they assert/abort on destruction if they where not checked, tracked by an internal flag).
Note that this doesn't use exceptions (but it is runtime and it is not that bad if you test the code often, it is after all a logical error).
For compile-time, it is more tricky, returning (for example a bool) with [[nodiscard]] is not enough because there are ways of no discarding without checking for example assigning to a (bool) variable.
I think the next layer is to active -Wunused-variable -Wunused-expression -Wunused-parameter (and treat it like an error -Werror=...).
Then it is much harder to not check the bool because comparison is pretty much to only operation you can really do with a bool.
(You can assign to another bool but then you will have to use that variable).
I guess that's quite enough.
There are still Machiavelian ways to mark a variable as used.
For that you can invent a bool-like type (class) that is 1) [[nodiscard]] itself (classes can be marked nodiscard), 2) the only supported operation is ==(bool) or !=(bool) (maybe not even copyable) and return that from your function. (as a bonus you don't need to mark your function as [[nodiscard]] because it is automatic.)
I guess it is impossible to avoid something like (void)b; but that in itself becomes a flag.
Even if you cannot avoid the absence of checking, you can force patterns that will immediately raise eyebrows at least.
You can even combine the runtime and compile time strategy.
(Make CheckedBool exploding.)
This will cover so many cases that you have to be happy at this point.
If compiler flags don’t protect you, you will have still a backup that can be detected in unit tests (regardless of taking the error path!).
(And don’t tell me now that you don’t unit test critical code.)
What you want is a special case of substructural types. Rust is famous for implementing a special case called "affine" types, where you can "use" something "at most once". Here, you instead want "relevant" types, where you have to use something at least once.
C++ has no official built-in support for such things. Maybe we can fake it? I thought not. In the "appendix" to this answer I include my original logic for why I thought so. Meanwhile, here's how to do it.
(Note: I have not tested any of this; I have not written any C++ in years; use at your own risk.)
First, we create a protected destructor in MustCheck. Thus, if we simply ignore the return value, we will get an error. But how do we avoid getting an error if we don't ignore the return value? Something like this.
(This looks scary: don't worry, we wrap most of it in a macro.)
int main(){
struct Temp123 : MustCheck {
void f() {
MustCheck* mc = this;
*mc = DoSomething();
}
} res;
res.f();
if(!res.success) print "oops";
}
Okay, that looks horrible, but after defining a suitable macro, we get:
int main(){
CAPTURE_RESULT(res, DoSomething());
if(!res.success) print "oops";
}
I leave the macro as an exercise to the reader, but it should be doable. You should probably use __LINE__ or something to generate the name Temp123, but it shouldn't be too hard.
Disclaimer
Note that this is all sorts of hacky and terrible, and you likely don't want to actually use this. Using [[nodiscard]] has the advantage of allowing you to use natural return types, instead of this MustCheck thing. That means that you can create a function, and then one year later add nodiscard, and you only have to fix the callers that did the wrong thing. If you migrate to MustCheck, you have to migrate all the callers, even those that did the right thing.
Another problem with this approach is that it is unreadable without macros, but IDEs can't follow macros very well. If you really care about avoiding bugs then it really helps if your IDE and other static analyzers understand your code as well as possible.
As mentioned in the comments you can use [[nodiscard]] as per:
https://learn.microsoft.com/en-us/cpp/cpp/attributes?view=msvc-160
And modify to use this warning as compile error:
https://learn.microsoft.com/en-us/cpp/preprocessor/warning?view=msvc-160
That should cover your use case.

Professional way to initialize an empty string in c++

Hi I want to learn a professional way to initialize an emtpy string in c++.
I could do
std::string a_string; // way_1
or
std::string a_string = ""; // way_2
But I think way_1 is fully depending on the default value defined in std package. To me, it is not an explicit declaration, and it might need to be changed if the codes of std::string is changed in the future. way_2 to me is not directly but using "" which is equivalent to an empty string. To me, a built-in empty string is professional, like nullptr for initializing a pointer.
Do you know how would C++ professional programmers initialize an empty string?
Thanks
But I think way_1 is fully depending on the default value defined in std package. To me, it is not an explicit declaration, and it might need to be changed if the codes of std::string is changed in the future.
This is faulty reasoning. The default value of std::string is defined by the C++ standard and is just as stable as every other part of the language. If you are worried about it changing, then you should also be worrying about std::string disappearing completely, or the meaning of "" changing, or the meaning of the initialization with the empty string changing.
If you are going for "professional", then I, if I saw a programmer using the explicit form, would wonder whether this programmer doesn't know what the default initialization of such a fundamental type as std::string does, and therefore what else this programmer might not know.

Should I use const for local variables for better code optimization?

I often use const for local variables that are not being modified, like this:
const float height = person.getHeight();
I think it can make the compiled code potentially faster, allowing the compiler to do some more optimization. Or am I wrong, and compilers can figure out by themselves that the local variable is never modified?
Or am I wrong, and compilers can figure out by themselves that the local variable is never modified?
Most of the compilers are smart enough to figure this out themselves.
You should rather use const for ensuring const-correctness and not for micro-optimization.
const correctness lets compiler help you guard against making honest mistakes, so you should use const wherever possible but for maintainability reasons & preventing yourself from doing stupid mistakes.
It is good to understand the performance implications of code we write but excessive micro-optimization should be avoided. With regards to performance one should follow the,
80-20 Rule:
Identify the 20% of your code which uses 80% of your resources, through profiling on representative data sets and only then attempt to optimize those bottlenecks.
This performance difference will almost certainly be negligible, however you should be using const whenever possible for code documentation reasons. Often times, compilers can figure this out for your anyway and make the optimizations automatically. const is really more about code readability and clarity than performance.
If there is a value type on the left hand, you may safely assume that it will have a negligible effect, or none at all. It's not going to influence overload resolution, and what is actually const can easily be deduced from the scope.
It's an entirely different matter with reference types:
std::vector<int> v(1);
const auto& a = v[0];
auto& b = v[0];
These two assignments resolve to two entirely different operators, and similar overload pairs are found in many libraries aside STL too. Even in this simple example, optimizations which depend on v having been immutable for the scope of b are already no longer trivial and less likely to be found.
The STL is still quite tame in these terms though, in such that at least the behavior doesn't change based on choosing the const_reference overload or not. For most of STL, the const_reference overload is only tied to the object being const itself.
Some other libraries (e.g. Qt) make heavy use of copy-on-write semantics. In these const-correctness with references is no longer optional, but necessary:
QVector<int> v1(1);
auto v2 = v1; // Backing storage of v2 and v1 is still linked
const auto& a = v1[0]; // Still linked
const auto& b = v2[0]; // Still linked
auto& c = v2[0]; // Deep copy from v1 to v2 is happening now :(
// Even worse, &b != &c
Copy-on-write semantics are something commonly found in large-matrix or image manipulation libraries, and something to watch out for.
It's also something where the compiler is no longer able to save you, the overload resolution is mandated by C++ standard and there is no leeway for eliminating the costly side effects.
I don't think it is a good practice to make local variables, including function parameters, constant by default.
The main reason is brevity. Good coding practices allow you to make your code short, this one doesn't.
Quite similarly, you can write void foo(void) in your function declarations, and you can justify it by increased clarity, being explicit about not intending to pass a parameter to the function, etc, but it is essentially a waste of space, and eventually almost fell out of use. I think the same thing will happen to the trend of using const everywhere.
Marking local variables with a const qualifier is not very useful for most of the collaborators working with the code you create. Unlike class members, global variables, or the data pointed to a by a pointer, a local variable doesn't have any external effects, and no one would ever be restricted by the qualifier of the local variable or learn anything useful from it (unless he is going to change the particular function where the local variable is).
If he needs to change your function, the task should not normally require him to try to deduce valuable information from the constant qualifier of a variable there. The function should not be so large or difficult to understand; if it is, probably you have more serious problems with your design, consider refactoring. One exception is functions that implement some hardcore math calculations, but for those you would need to put some details or a link to the paper in your comments.
You might ask why not still put the const qualifier if it doesn't cost you much effort. Unfortunately, it does. If I had to put all const qualifiers, I would most likely have to go through my function after I am done and put the qualifiers in place - a waste of time. The reason for that is that you don't have to plan the use of local variables carefully, unlike with the members or the data pointed to by pointers.
They are mostly a convenience tool, technically, most of them can be avoided by either factoring them into expressions or reusing variables. So, since they are a convenience tool, the very existence of a particular local variable is merely a matter of taste.
Particularly, I can write:
int d = foo(b) + c;
const int a = foo(b); int d = a + c;
int a = foo(b); a += c
Each of the variants is identical in every respect, except that the variable a is either constant or not, or doesn't exist at all. It is hard to commit to some of the choices early.
There is one major problem with local constant values - as shown in the code below:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
// const uint32_t const_dummy = 0;
void func1(const uint32_t *ptr);
int main(void) {
const uint32_t const_dummy = 0;
func1(&const_dummy);
printf("0x%x\n", const_dummy);
return EXIT_SUCCESS;
}
void func1(const uint32_t *ptr) {
uint32_t *tmp = (uint32_t *)ptr;
*tmp = 1;
}
This code was compiled on Ubuntu 18.04.
As you can see, const_dummy's value can be modified in this case!
But, if you modify the code and set const_dummy scope to global - by commenting out the local definition and remove the comment from the global definition - you will get an exception and your our program will crash - which is good, because you can debug it and find the problem.
What is the reason? Well global const values are located in the ro (read only) section of the program. The OS - protects this area using the MMU.
It is not possible to do it with constants defined in the stack.
With systems that don't use the MMU - you will not even "feel" that there is a problem.

Use of const double for intermediate results

I a writing a Simulation program and wondering if the use of const double is of any use when storing intermediate results. Consider this snippet:
double DoSomeCalculation(const AcModel &model) {
(...)
const double V = model.GetVelocity();
const double m = model.GetMass();
const double cos_gamma = cos(model.GetFlightPathAngleRad());
(...)
return m*V*cos_gamma*Chi_dot;
}
Note that the sample is there only to illustrate -- it might not make to much sense from the engineering side of things. The motivation of storing for example cos_gamma in a variable is that this cosine is used many time in other expressions covered by (...) and I feel that the code gets more readable when using
cos_gamma
rather than
cos(model.GetFlightPathAngleRad())
in various expressions. Now the actual is question is this: since I expect the cosine to be the same througout the code section and I actually created the thing only as a placeholder and for convenience I tend to declare it const. Is there a etablished opinion on wether this is good or bad practive or whether it might bite me in the end? Does a compiler make any use of this additional information or am I actually hindering the compiler from performing useful optimizations?
Arne
I am not sure about the optimization part, but I think it is good to declare it as const. This is because if code is large, then if somebody incorrectly does a cos_gamma = 1 in between you will get a compiler error instead of run time surprises.
You can certainly help things by only computing the cosine once and using it everywhere. Making that result const is a great way to ensure you (or someone else) don't try to change it somewhere down the road.
A good rule of thumb here is to make it correct and readable first. Don't worry about any optimizations the compiler might or might not make. Only after profiling and discovering that a piece of code is indeed too slow should you worry about helping the compiler optimize things.
Given your code:
const double V = model.GetVelocity();
const double m = model.GetMass();
const double cos_gamma = cos(model.GetFlightPathAngleRad());
I would probably leave cos_gamma as it is. I'd consider changing V and m to references though:
const double &V = model.GetVelocity();
const double &m = model.GetMass();
This way you're making it clear that these are strictly placeholders. It does, however, raise the possibility of lifetime issues -- if you use a reference, you clearly have to ensure that what it refers to has sufficient lifetime. At least from the looks of things, this probably won't be a problem though. First of all, GetVelocity() and GetMass() probably return values, not references (in which case you're initializing the references with temporaries, and the lifetime of the temporary is extended to the lifetime of the reference it initializes). Second, even if you return an actual reference, it's apparently to a member of the model, which (at a guess) will exist throughout the entire calculation in question anyway.
I wish I worked with more code written like this!
Anything you can do to make your code more readable can only be a good thing. Optimize only when you need to optimize.
My biggest beef with C++ is that values are not const by default. Use const liberally and keep your feet bullet-hole free!
Whether the compiler makes use of this is an interesting question - once you're at the stage of optimizing a fully working program. Until then, you write the program mostly for the humans coming later and having to look at the code. The compiler will happily swallow (objectively) very unreadable code, lacking spaces, newlines, and sporting hilarious indentation. Humans won't.
The most important question is, how readable is the code, and how easy it is to make a mistake changing it. Here, const helps tremendously, because it makes the compiler bark at everyone who mistakenly changes anything that shouldn't change. I always make everything const unless I really, really want it to be changeable.

How can I trust the behavior of C++ functions that declare const?

This is a C++ disaster, check out this code sample:
#include <iostream>
void func(const int* shouldnotChange)
{
int* canChange = (int*) shouldnotChange;
*canChange += 2;
return;
}
int main() {
int i = 5;
func(&i);
std::cout << i;
return 0;
}
The output was 7!
So, how can we make sure of the behavior of C++ functions, if it was able to change a supposed-to-be-constant parameter!?
EDIT: I am not asking how can I make sure that my code is working as expected, rather I am wondering how to believe that someone else's function (for instance some function in some dll library) isn't going to change a parameter or posses some behavior...
Based on your edit, your question is "how can I trust 3rd party code not to be stupid?"
The short answer is "you can't." If you don't have access to the source, or don't have time to inspect it, you can only trust the author to have written sane code. In your example, the author of the function declaration specifically claims that the code will not change the contents of the pointer by using the const keyword. You can either trust that claim, or not. There are ways of testing this, as suggested by others, but if you need to test large amounts of code, it will be very labour intensive. Perhaps moreso than reading the code.
If you are working on a team and you have a team member writing stuff like this, then you can talk to them about it and explain why it is bad.
By writing sane code.
If you write code you can't trust, then obviously your code won't be trustworthy.
Similar stupid tricks are possible in pretty much any language. In C#, you can modify the code at runtime through reflection. You can inspect and change private class members. How do you protect against that? You don't, you just have to write code that behaves as you expect.
Apart from that, write a unittest testing that the function does not change its parameter.
The general rule in C++ is that the language is designed to protect you from Murphy, not Machiavelli. In other words, its meant to keep a maintainance programmer from accidentally changing a variable marked as const, not to keep someone from deliberatly changing it, which can be done in many ways.
A C-style cast means all bets are off. It's sort of like telling the compiler "Trust me, I know this looks bad, but I need to do this, so don't tell me I'm wrong." Also, what you've done is actually undefined. Casting off const-ness and then modifying the value means the compiler/runtime can do anything, including e.g. crash your program.
The only thing I can suggest is to allocate the variable shouldNotChange from a memory page that is marked as read-only. This will force the OS/CPU to raise an error if the application attempts to write to that memory. I don't really recommend this as a general method of validating functions just as an idea you may find useful.
The simplest way to enforce this would be to just not pass a pointer:
void func(int shouldnotChange);
Now a copy will be made of the argument. The function can change the value all it likes, but the original value will not be modified.
If you can't change the function's interface then you could make a copy of the value before calling the function:
int i = 5;
int copy = i
func(&copy);
Don't use C style casts in C++.
We have 4 cast operators in C++ (listed here in order of danger)
static_cast<> Safe (When used to 'convert numeric data types').
dynamic_cast<> Safe (but throws exceptions/returns NULL)
const_cast<> Dangerous (when removing const).
static_cast<> Very Dangerous (When used to cast pointer types. Not a very good idea!!!!!)
reinterpret_cast<> Very Dangerous. Use this only if you understand the consequences.
You can always tell the compiler that you know better than it does and the compiler will accept you at face value (the reason being that you don't want the compiler getting in the way when you actually do know better).
Power over the compiler is a two edged sword. If you know what you are doing it is a powerful tool the will help, but if you get things wrong it will blow up in your face.
Unfortunately, the compiler has reasons for most things so if you over-ride its default behavior then you better know what you are doing. Cast is one the things. A lot of the time it is fine. But if you start casting away const(ness) then you better know what you are doing.
(int*) is the casting syntax from C. C++ supports it fully, but it is not recommended.
In C++ the equivalent cast should've been written like this:
int* canChange = static_cast<int*>(shouldnotChange);
And indeed, if you wrote that, the compiler would NOT have allowed such a cast.
What you're doing is writing C code and expecting the C++ compiler to catch your mistake, which is sort of unfair if you think about it.