Const-correctness of function parameters on API changes - c++

Suppose I am using a library which implements the function foo, and my code could look something like this:
#include <iostream>
void foo(const int &) { }
int main() {
int x = 1;
foo(x);
std::cout << (1/x) << std::endl;
}
Everything works fine. But now suppose at one point either foo gets modified or overloaded for some reason. Now what we get could be something like this:
#include <iostream>
void foo(int & x) {
x--;
}
void foo(const int &) {}
int main() {
int x = 1;
foo(x);
std::cout << (1/x) << std::endl;
}
BAM. Suddenly the program breaks. What we actually wanted to pass in that snippet was a constant reference, but x is a non-const lvalue, so after the API change the compiler prefers the new foo(int &) overload. That overload decrements x to 0, and 1/x then divides by zero.
What we wanted was actually this:
int main() {
int x = 1;
foo(static_cast<const int &>(x));
std::cout << (1/x) << std::endl;
}
With this fix, the program starts working again. However, I must say I've not seen many of these casts around in code, as everybody seems to simply trust this type of error not to happen. In addition, this seems needlessly verbose, and if there's more than one parameter and names start to become longer, function calls get really messy.
Is this a reasonable concern and how should I go about it?

If you change a function that takes a const reference so that it takes a non-const reference instead, you are likely to break things. This means you have to inspect EVERY place where that function is called, and ensure that it is safe. Furthermore, having two functions with the same name, one taking a const reference and one taking a non-const reference, is definitely a bad plan in this sort of scenario.
The correct thing to do is to create a new function, which does the x-- variant, with a different name from the existing one.
Any API supplier that does something like this should be severely and physically punished, possibly with slightly less violence involved if there is a BIG notice in the documentation saying "We have changed function foo, it now decrements x unless the parameter is cast to const". It's one of the worst possible binary breaks one can imagine (in terms of "it'll be terribly hard to find out what went wrong").

Related

Sometimes a good practice to initialize a class pointer member variable to itself?

For a strictly internal class that is not intended to be used as part of an API provided to an external client, is there anything inherently evil with initializing a class pointer member variable to itself rather than NULL or nullptr?
Please see the below code for an example.
#include <iostream>
class Foo
{
public:
Foo() :
m_link(this)
{
}
Foo* getLink()
{
return m_link;
}
void setLink(Foo& rhs)
{
m_link = &rhs;
// Do other things too.
// Obviously, the name shouldn't be setLink() if the real code is doing multiple things,
// but this is a code sample.
}
void changeState()
{
// This is a code sample, but play along and assume there are actual states to change.
std::cout << "Changing a state." << std::endl;
}
private:
Foo* m_link;
};
void doSomething(Foo& foo)
{
Foo* link = foo.getLink();
if (link == &foo)
{
std::cout << "A is not linked to anything." << std::endl;
}
else
{
std::cout << "A is linked to something else. Need to change the state on the link." << std::endl;
link->changeState();
}
}
int main(int argc, char** argv)
{
Foo a;
doSomething(a);
std::cout << "-------------------" << std::endl;
// This is a mere code sample.
// In the real code, I'm fetching B from a container.
Foo b;
a.setLink(b);
doSomething(a);
return 0;
}
Output
A is not linked to anything.
-------------------
A is linked to something else. Need to change the state on the link.
Changing a state.
Pros
The benefit of initializing the pointer variable, Foo::m_link, to point to the object itself is to avoid accidental NULL dereferences. Since the pointer can never be NULL, at worst the program will produce erroneous output rather than a segmentation fault.
Cons
However, the clear downside to this strategy is that it appears to be unconventional. Most programmers are used to checking for NULL, and thus don't expect to check for equality with the object invoking the pointer. As such, this technique would be ill-advised to use in a codebase that is targeted for external consumers, that is, developers expecting to use this codebase as a library.
Final Remarks
Any thoughts from anyone else? Has anyone else said anything substantial on this subject, especially with C++98 in consideration? Note that I compiled this code with a GCC compiler with these flags: -std=c++98 -Wall and did not notice any issues.
P.S. Please feel free to edit this post to improve any terminology I used here.
Edits
This question is asked in the spirit of other good practice questions, such as this question about deleting references.
A more extensive code example has been provided to clear up confusion. To be specific, the sample is now 63 lines which is an increase from the initial 30 lines. Thus, the variable names have been changed and therefore comments referencing Foo:p should apply to Foo:link.
It's a bad idea to start with, but a horrendous idea as a solution to null dereferences.
You don't hide null dereferences. Ever. Null dereferences are bugs, not errors. When a bug happens, all invariants in your program go down the toilet and there can be no guarantee of any behaviour. Not allowing a bug to manifest itself immediately doesn't make the program correct in any sense; it only serves to obfuscate and make debugging significantly more difficult.
That aside, a structure pointing into itself is a gnarly can of worms. Consider your copy assignment
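Foo& operator=(const Foo& rhs) {
if(this == &rhs)
return *this; // self-assignment: nothing to do
if(rhs.m_link == &rhs)
m_link = this; // rhs is "unlinked", so this object must point to itself
else
m_link = rhs.m_link;
return *this;
}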
You now have to check whether you're pointing to yourself every time you copy because its value is possibly tied to its own identity.
As it turns out, there's plenty of cases where such checks are required. How is swap supposed to be implemented?
void swap(Foo& x, Foo& y) noexcept {
Foo* tx, *ty;
if(x.m_link == &x)
tx = &y;
else
tx = x.m_link;
if(y.m_link == &y)
ty = &x;
else
ty = y.m_link;
x.m_link = ty;
y.m_link = tx;
}
Suppose Foo has some sort of pointer/reference semantics, then your equality is now also non-trivial
bool operator==(const Foo& rhs) const {
return m_link == rhs.m_link || (m_link == this && rhs.m_link == &rhs);
}
Don't point into yourself. Just don't.
Foo is responsible for its own state. Especially pointers it exposes to its users.
If you expose a pointer in this fashion, as a public member, it is a very odd design decision. My gut has told me for the last 30-odd years that a pointer like this is not a responsible way to handle Foo's state.
Consider providing getters for this pointer instead.
Foo* getP() {
// create a safe pointer for user
// and indicate an error state. (exceptions might be an alternative)
}
Unless you share more context about what Foo is, advice is hard to provide.
is there anything inherently evil with initializing a class pointer member variable to itself rather than NULL or nullptr?
No. But as you pointed out, there might be different considerations depending on the use case.
I'm not sure this would be relevant under most circumstances, but there are some instances where an object needs to hold a pointer of its own type, so it's really just pertinent to those cases.
For instance, an element in a singly-linked list will have a pointer to the next element, so the last element in the list would normally have a NULL pointer to show there are no further elements. Using this example, the end element could point to itself instead of NULL to denote that it is the last element. It really just depends on personal implementation preference.
Many times, you can end up obfuscating code needlessly when trying too hard to make it crash-proof. Depending on the situation, you might mask issues and make problems much harder to debug. For instance, going back to the singly-linked example, if the pointer-to-self initialization method is used, and a bug in the program attempts to access the next element from the end element in the list, the list will return the end element again. This would most likely cause the program to continue "traversing" the list for eternity. That might be harder to find/understand than simply letting the program crash and finding the culprit via debugging tools.

How to detect mid-function value changes to const parameter?

I ran into a nasty bug in some of my code. Here's the simplified version:
#include <iostream>
#include <string>
class A
{
public:
std::string s;
void run(const std::string& x)
{
// do some "read-only" stuff with "x"
std::cout << "x = " << x << std::endl;
// Since I passed X as a const reference, I expected string never to change
// but actually, it does get changed by clear() function
clear();
// trying to do something else with "x",
// but now it has a different value although I declared it as
// "const". This killed the code logic.
std::cout << "x = " << x << std::endl;
// is there some way to detect possible change of X here during compile-time?
}
void clear()
{
// in my actual code, this doesn't even happen here, but 3 levels deep on some other code that gets called
s.clear();
}
};
int main()
{
A a;
a.s = "test";
a.run(a.s);
return 0;
}
Basically, the code that calls a.run() used to be called with all kinds of strings in the past, and at one point I needed the exact value that object "a.s" had, so I just put a.s in there and then some time later noticed the program behaving weirdly. I tracked it down to this.
Now, I understand why this is happening, but it looks like one of those really hard to trace and detect bugs. You see the parameter declared as const & and suddenly its value changes.
Is there some way to detect this during compile-time? I'm using Clang and MSVC.
Thanks.
Is there some way to detect this during compile-time?
I don't think so. There is nothing inherently wrong with modifying a member variable that is also referred to by a const reference, so there is no reason for the compiler to warn about it. The compiler cannot read your mind to find out what your expectations are.
In some usages such a wrong assumption results in definite bugs, such as undefined behaviour, that could be diagnosed if identified. I suspect that identifying such cases in general would be quite expensive computationally, so I wouldn't rely on it.
Redesigning the interface could make that situation impossible. For example:
struct wrapper {
std::string str;
};
void run(const wrapper& x);
x.str will not alias the member because the member is not inside a wrapper.

C++ function about usage of return

Hello, I am totally new to C++ programming. I have a question: when we use an int function, why do we have to use the return statement? We can just use cout << sum << endl; inside the function and call the function in main(), but with return we have to write something like cout << printSum();
Case 1:
#include <iostream>
using namespace std;
int addNumbers(int x, int y){
int sum = x + y;
cout << sum << endl;
}
int main() {
addNumbers(4, 5);
return 0;
}// without using return
Case 2:
#include <iostream>
using namespace std;
int addNumbers(int x, int y){
int sum = x + y;
return sum;
}
int main() {
cout << addNumbers(4, 5) << endl;
return 0;
}// return method
The return statement is seemingly very basic but is, in fact, one of the most puzzling aspects of procedural programming for freshman students at our local university.
What are functions anyway?
Functions are the building blocks of procedural and functional programming languages. You might think of a function as a reusable, maintainable block of code responsible for some action. If you often perform some operations together, you might want to "pack" them inside a function. A "good" function is a little machine that takes some input data and gives back processed data.
In C++ you can "share" the results of the processing with "outside code" by returning a value, modifying a parameter (worse) or modifying the global state (worst).
In the simple case, "returning" functions are direct analogs of math functions like y = f(x) (even the call syntax is very similar). As a programmer you just define what x is, what y is, and how exactly f maps x to y.
Printing vs returning
Now, printing to the console (terminal) and returning are not the same. In simple words, printing is just showing the user some characters on the screen ("speaking with the user"). Returning from a function lets the caller receive the result ("speaking with outside code"), but it's invisible to the user unless you print it afterwards.
Different function layouts you may encounter while learning
(does not pretend to be an exhaustive list)
So, in your class or tutorial sometimes they teach you how to
(1) print an object directly within main()
int main() {
int i = 42 + 13;
std::cout << i << std::endl;
}
(2) print an object inside a function
void printSum(int a, int b) {
int i = a + b;
std::cout << i << std::endl;
}
int main() {
printSum(42, 13);
}
(3) return an object from the function and print it afterward
int sum(int a, int b) {
return a + b;
}
int main() {
int s = sum(42, 13);
std::cout << s << std::endl;
}
Obviously, (1) is not reusable at all: to change parameters you must intervene in the program logic. Also, if the logic is something more than just a sum, the main() function will quickly grow in size and become unmaintainable. Responsibilities will be scattered across the code, violating the Single Responsibility Principle.
(2) is better in all aspects: the function encapsulates logic in a separate place in the code, has a distinctive name, and can be changed separately from the main function. The function can be called with different values, and changing arguments doesn't change the function.
But it still has 2 responsibilities: performing the calculation and outputting to the screen. What if you don't want to print the result, but write it to a file? Or send it via network? Or just discard it? You will need to write another function. The logic inside this new function will be the same, thus introducing duplication and violating the DRY principle. Also, (2) is not a pure function because it modifies std::cout, which is basically global state (we say that this function has "side effects").
Also, you can think about (2) as if it was (1), but with whole program logic moved from main() into a separate function (that's why sometimes functions are called "subprograms").
(3) solves the multiple responsibility problem, by getting rid of printing (moving it into the "user code"). The function contains only pure logic. Now it's a number crunching machine. In main function you can decide what to do with the result of the calculations without touching the calculating function. It becomes a "black box". No need to worry how it is implemented, and no need to change anything, it just works.
In your post there is a shorter version of (3):
int main() {
std::cout << sum(42, 13) << std::endl;
}
The difference is that no named variable int s is created; the return value is written directly to std::cout (in fact, passed as an argument to a function called operator<<).
In real life
In real-life programming, most of the time you will be writing functions like (3), that don't print anything. Printing to terminal is just a quick and simple way to visualize data. That's why you've been taught how to output to standard streams rather than writing to the files, network sockets or showing GUI widgets. Console printing is also very handy during debugging.
See also:
Call Stack - Wikipedia to better understand how function calls happen under the hood
The Definitive C++ Book Guide and List might be of help too.
P.S. There is a great deal of simplification in this post, hence the many quoted terms.
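An int function must return an int; flowing off the end of a non-void function without a return statement, as in your Case 1, is undefined behavior. If you don't want to return a value, you can use a void function: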
#include <iostream>
using namespace std;
void addNumbers(int x, int y){
int sum = x + y;
cout << sum << endl;
}
int main() {
addNumbers(4, 5);
return 0;
}
main() must return int (per the standard); returning 0 indicates that the program ran successfully. You can even leave the return statement out of main(), in which case it implicitly returns zero.
#include <iostream>
using namespace std;
void addNumbers(int x, int y){
int sum = x + y;
cout << sum << endl;
}
int main() {
addNumbers(4, 5);
}
You have two slightly different scenarios in your snippets.
Case 1: You are printing the value from the called function, before the function finishes execution. So, you don't (need to) return any value, and you don't need a return statement.
Case 2: You are expecting the function to return the value which will get printed. So, you need to have the return statement in your called function.
This is because of the return type of the function. Sometimes you want a function to give you a result instead of just doing things with the information you put into it.
Just like a variable has a type such as int and string, a function wants a type for the item it will be returning.
Here is a website which might help you get into the basics of functions including information about return types.

C++ Cannot modify instance of a class

Inside the following thread routine:
void* Nibbler::moveRoutine(void* attr)
{
[...]
Nibbler* game = static_cast<Nibbler*>(attr);
while (game->_continue == true)
{
std::cout << game->_snake->_body.front()->getX() << std::endl; // display 0
std::cout << game->getDirection() << std::endl; // display 0
game->moveSnake();
std::cout << game->_snake->_body.front()->getX() << std::endl; // display 0
std::cout << game->getDirection() << std::endl; // display 42
}
}
[...]
}
I am calling the member function moveSnake(), which is supposed to modify the positions of the cells forming my snake's body.
void Nibbler::moveSnake()
{
[...]
std::cout << this->_snake->_body.front()->getX() << std::endl; // display 0
this->_snake->_body.front()->setX(3);
this->_direction = 42;
std::cout << this->_snake->_body.front()->getX() << std::endl; // display 3
[...]
}
Although my two coordinates are effectively modified inside my moveSnake() function, the change is gone when I go back to my routine, where they keep their initial value. I don't understand why this is happening, since if I modify any other value of my class inside my moveSnake() function, the instance is modified and keeps this value back in the routine.
The Nibbler class :
class Nibbler
{
public :
[...]
void moveSnake();
static void* moveRoutine(void*);
private :
[...]
int _direction;
Snake* _snake;
IGraphLib* _lib;
pthread_t _moveThread;
...
};
The snake :
class Snake
{
public :
[...]
std::vector<Cell*> _body;
};
And finally the cell :
class Cell
{
public :
void setX(const int&);
void setY(const int&);
int getX() const;
int getY() const;
Cell(const int&, const int&);
~Cell();
private :
int _x;
int _y;
};
The cell.cpp code :
void Cell::setX(const int& x)
{
this->_x = x;
}
void Cell::setY(const int& y)
{
this->_y = y;
}
int Cell::getX() const
{
return this->_x;
}
int Cell::getY() const
{
return this->_y;
}
Cell::Cell(const int& x, const int& y)
{
this->_x = x;
this->_y = y;
}
Cell::~Cell()
{}
On its face, your question ("why does this member not get modified when it should?") seems reasonable. The design intent of what has been shown is clear enough and I think it matches what you have described. However, other elements of your program have conspired to make it not so.
One thing that may plague you is Undefined Behavior. Believe it or not, even the most experienced C++ developers run afoul of UB occasionally. Also, stack and heap corruption are extremely easy ways to cause terribly difficult-to-isolate problems. You have several things to turn to in order to root it out:
Debuggers (START HERE!)
With a simple single-step debugger, you can walk through your code and check your assumptions at every turn. Set a breakpoint, execute until, check the state of memory/variables, bisect the problem space again, iterate.
Static analysis
Starting with compiler warnings and moving up to lint and sophisticated commercial tools, static analysis can help point out "code smell" that may not necessarily be UB, but could be dead code or other places where your code likely doesn't do what you think it does.
Have you ignored the errors returned by the library/OS you're making calls into? In your case, it seems as if you're manipulating the memory directly, but this is a frequent source of mismatch between expectations and reality.
Do you have a rubber duck handy?
Dynamic analysis
Tools like Electric Fence/Purify/Valgrind(memcheck, helgrind)/Address-Sanitizer, Thread-Sanitizer/mudflaps can help identify areas where you've written to memory outside of what's been allocated.
If you haven't used a debugger yet, that's your first step. If you've never used one before, now is the time when you must take a brief timeout and learn how. If you plan on making it beyond this level, you will be thankful that you did.
If you're developing on Windows, there's a good chance you're using Visual Studio. The debugger is likely well-integrated into your IDE. Fire it up!
If you are developing on Linux/BSD/OSX, you either have access to gdb or Xcode, both of which should be simple enough for this problem. Read a tutorial, watch a video, do whatever it takes and get that debugger rolling. You may quickly discover that your code has been modifying one instance of Snake and printing out the contents of another (or something similarly frustrating).
If you can't duplicate the problem condition when you use a debugger, CONGRATULATIONS! You have found a heisenbug. It likely indicates a race condition, and that information alone will help you home in on the source of the problem.

Calling functions from main() in c++

I came across a program with 10 header and 10 source files. I read in my text book that the functions are called from main. But how can I pass data to so many functions from main()?
Functions don't necessarily need to be called from main. They can be called by other functions. For example:
#include <iostream>
int foo(int x)
{
return x*x;
}
int bar(int x)
{
return foo(x) + 1;
}
int main()
{
int a = bar(42);
std::cout << a << std::endl;
return 0;
}
Note that foo() is never called directly from main().
To my mind, this phrase isn't correct, but I guess what was meant could be rephrased as "Every function or class method that you implement and use is somehow called from your main() routine".
And "somehow" in this context actually means directly or indirectly, via other functions / function wrappers.
Anyway, the idea should be clear: any significant action in your application is actually done through some function call from your main() routine, which is sometimes also called the application root (try to think of your application as a tree of function calls; your main() function is then right at the top of the tree).