forcing a function to be pure - c++

In C++ it is possible to declare that a function is const, which means, as far as I understand, that the compiler ensures the function does not modify the object. Is there something analogous in C++ where I can require that a function is pure? If not in C++, is there a language where one can make this requirement?
If this is not possible, why is it possible to require functions to be const but not require them to be pure? What makes these requirements different?
For clarity, by pure I want there to be no side effects and no use of variables other than those passed into the function. As a result there should be no file reading or system calls etc.
Here is a clearer definition of side effects:
No modification to files on the computer that the program is run on and no modification to variables with scope outside the function. No information is used to compute the function other than variables passed into it. Running the function should return the same thing every time it is run.
NOTE: I did some more research and encountered PureScript
(Thanks for jarod42's comment)
Based on a quick read of the Wikipedia article, I am under the impression that you can require functions to be pure in PureScript; however, I am not completely sure.

Short answer: No. There is no equivalent keyword called pure that constrains a function like const does.
However, if you have a specific global variable you'd like to remain untouched, you do have the option of declaring it at file scope as static type myVar. This gives it internal linkage, so only functions in that file can use it by name. Any function outside that file is then constrained to leave it alone.
As to "side effects", I will break each of them down so you know what options you have:
No modification to files on the computer that the program is run on.
You can't constrain a function to do this, as far as I'm aware. C++ just doesn't offer a way to constrain a function like this. You can, however, design a function to not modify any files, if you like.
No modification to variables with scope outside the function.
Globals are the only variables I'm aware of that a function can modify outside its own scope, besides anything passed by pointer or reference as a parameter. A global can be declared const, which keeps it from being modified, or static, which keeps functions in other files from touching it, but beyond that there's really nothing you can do, as far as I'm aware.
No information is used to compute the function other than variables passed into it.
Again, you can't constrain it to do so, as far as I'm aware. However, you can design the function to work like this if you want.
Running the function should return the same thing every time it is run.
I'm not sure I understand why you want to constrain a function like this, but no, not as far as I'm aware. Again, you can design it like this if you like, though.
As to why C++ doesn't offer an option like this? I'm guessing reusability. It appears that you have a specific list of things you don't want your function to do. However, the likelihood that a lot of other C++ users as a whole will need this particular set of constraints often is very small. Maybe they need one or two at a time, but not all at once. It doesn't seem like it would be worth the trouble to add it.
The same, however, cannot be said about const. const is used all the time, especially in parameter lists, to keep data from being modified when it's passed by reference. For that to work, the compiler needs to know which member functions modify the object, and it uses const in the function declaration to keep track of this. Otherwise, it would have no way of knowing. With const, it's quite simple: a const object may only call member functions that carry the const keyword in their declaration, which guarantees that it remains constant.
Thus, const gets a lot of reuse.

Currently, C++ does not have a mechanism to ensure that a function has "no side effects and no use of variables other than those passed into the function." You can only force yourself to write pure functions, as mentioned by Jack Bashford. The compiler can't check this for you.
There is a proposal (N3744 Proposing [[pure]]). There you can see that GCC and Clang already support __attribute__((pure)). Maybe it will be standardized in some form in a future revision of C++.
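A minimal sketch of the __attribute__((pure)) extension mentioned above, assuming GCC or Clang. The function name sum_of_squares is made up for illustration. The attribute is a promise the programmer makes to the optimizer; the compiler does not reject a body that breaks the promise.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: compiled with GCC or Clang, which accept this attribute.
// The attribute tells the optimizer that the function has no observable
// side effects; the compiler does not verify that claim.
__attribute__((pure))
int sum_of_squares(const int* values, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i] * values[i];
    }
    return total;
}

// Small self-check showing a call site.
void check_sum_of_squares() {
    const int data[] = {1, 2, 3};
    assert(sum_of_squares(data, 3) == 14);
}
```

Note that GCC's pure attribute still allows the function to read global memory, so it is weaker than the definition in the question; GCC's separate const attribute is closer to "depends only on the arguments".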

In C++ it is possible to declare that a function is const, which means, as far as I understand, that the compiler ensures the function does not modify the object.
Not quite. The compiler will allow the object to be modified by (potentially ill-advised) use of const_cast. So the compiler only ensures that the function does not accidentally modify the object.
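As a rough illustration of that point, the sketch below uses a hypothetical Counter class. The const member function compiles even though it changes the object, because const_cast removes the protection; this is only well-defined when the object itself was not declared const.

```cpp
#include <cassert>

// Hypothetical class, for illustration only.
class Counter {
public:
    // Declared const, yet it modifies the object by casting away constness.
    // Well-defined only if the Counter object itself is not const;
    // otherwise the behavior is undefined.
    void sneaky_increment() const {
        const_cast<Counter*>(this)->count_ += 1;
    }

    // The line below would not compile: count_ is read-only inside a
    // const member function.
    // void honest_increment() const { count_ += 1; }

    int count() const { return count_; }

private:
    int count_ = 0;
};

// Small self-check: the "const" call changed the object.
void check_counter() {
    Counter c;  // not declared const, so the cast is well-defined here
    c.sneaky_increment();
    assert(c.count() == 1);
}
```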
What makes these requirements [constant and pure] different?
They are different because one affects correct functionality while the other does not.
Suppose C is a container and you are iterating over its contents. At some point within the loop, perhaps you need to call a function that takes C as a parameter. If that function were to clear() the container, your loop will likely crash. Sure, you could build a loop that can handle that, but the point is that there are times when a caller needs assurance that the rug will not be pulled out from under it. Hence the ability to mark things const. If you pass C as a constant reference to a function, that function is promising to not modify C. This promise provides the needed assurance (even though, as I mentioned above, the promise can be broken).
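The container scenario can be sketched as follows. The names count_positive and sum_of_counts are hypothetical; the point is only that the helper takes the container by const reference, so the calling loop can rely on its iterators staying valid.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: takes the container by const reference, so it cannot
// call clear(), push_back(), or any other non-const member on it.
std::size_t count_positive(const std::vector<int>& values) {
    std::size_t n = 0;
    for (int v : values) {
        if (v > 0) ++n;
    }
    // values.clear();  // would not compile: clear() is not a const member
    return n;
}

// The caller iterates over c and calls the helper in the middle of the loop.
std::size_t sum_of_counts(std::vector<int>& c) {
    std::size_t total = 0;
    for (auto it = c.begin(); it != c.end(); ++it) {
        total += count_positive(c);  // c is unchanged, so `it` stays valid
    }
    return total;
}

// Small self-check.
void check_counts() {
    std::vector<int> c = {1, -2, 3};
    assert(sum_of_counts(c) == 6);  // 3 iterations, 2 positives each
}
```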
I am not aware of a case where use of a non-pure function could similarly cause a program to crash. If there is no use for something, why complicate the language with it? If you can come up with a good use-case, maybe it is something to consider for a future revision of the language.
(Knowing that a function is pure could help a compiler optimize code. As far as I know, it's been left up to each compiler to define how to flag that, as it does not affect functionality.)

Related

Is there a way to pass an unknown number of arguments to a function?

Right now, I am trying to call a function in C++ through a Json object. The Json object would provide me with the name of the callee function and all the parameters. I will be able to extract the parameters using a for loop, but I am not sure how I can pass them in. A for loop only lets me handle the arguments one by one, and I have not found a way to call a function other than passing in all the arguments at once.
I've made a temporary solution of:
if (parameter_count == 1)
    func(param_1);
if (parameter_count == 2)
    func(param_1, param_2);
...
It seems this solution would not work for all cases, since it only handles functions with a limited number of arguments (depending on how many ifs I write). Is there a better way to do this? Thanks!
EDIT: Sorry if I was being unclear. I do not know anything about func. I will be reading func from DLL based on its string name. Since I can't really change the function itself, I wouldn't be able to pass in a vector or struct directly.
Or perhaps did I have the wrong understanding? Are we allowed to pass in a single vector in place of a lot of parameters?
Sorry for making a mess through so many edits on this question. Brandon's solution with libffi works. Thanks!
So the problem as I understand it is that you have a void * pointer (which would come from your platform's DLL loading code) which "secretly" is a pointer to a function with a signature which is only known at runtime. You'd like to call this function at runtime with specified arguments.
Unfortunately, this is not possible to do cleanly with standard C++ alone. C++ cannot work with types that are not present in the program at compile-time, and since there is an infinite number of potential function signatures involved here there is no way to compile them all in.
What you'll want to do instead is manually set up the stack frame on your call stack and then jump to it, either via inline assembly or via some library or compiler extension that accomplishes this for your platform.
Here is a simple example of doing this via inline assembly. (To do this in general you will need to learn your platform's calling convention in detail, and needless to say this will constrain your program to the platform(s) you've implemented this for.)
I haven't actually tried it, but gcc has a compiler extension __builtin_apply that is apparently just meant to forward the arguments from one method wholesale to another but which could perhaps be used to accomplish something like this if you learned the (apparently opaque) description of the method.
[Update: Apparently I missed this in the comments, but Brandon mentioned libffi, a library which implements a bunch of platforms' calling conventions. This sounds like it might be the best option if you want to take this sort of approach.]
A final option would be to constrain the allowed signatures of your functions to a specified list, e.g. something like
switch (mySignature)
{
case VOID_VOID:
    // myPtr is a void*; cast it to the matching function-pointer type
    reinterpret_cast<void (*)()>(myPtr)();
    break;
case VOID_INT:
    reinterpret_cast<void (*)(int)>(myPtr)(my_int_arg_1);
    break;
// ...
}
(Syntax of the above may not be 100% correct; I haven't tested it yet.) Whether this approach is sensible for your purposes depends on what you're doing.
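A self-contained sketch of this "fixed list of signatures" idea follows. All names here (Signature, call_by_signature, set_to_seven, set_to) are invented for illustration, and the two free functions stand in for symbols that would really come from a DLL. It assumes a platform where a void* can be converted to and from a function pointer, which the standard only conditionally supports but which holds on the usual POSIX and Windows toolchains.

```cpp
#include <cassert>

// Hypothetical tag describing which signature a void* really points to.
enum class Signature { VoidVoid, VoidInt };

// Assumption: the platform allows converting between void* and function
// pointers (conditionally supported by the C++ standard).
void call_by_signature(void* fn, Signature sig, int int_arg = 0) {
    switch (sig) {
    case Signature::VoidVoid:
        reinterpret_cast<void (*)()>(fn)();
        break;
    case Signature::VoidInt:
        reinterpret_cast<void (*)(int)>(fn)(int_arg);
        break;
    }
}

// Two stand-ins for functions that would normally be loaded from a DLL.
static int g_last_value = 0;
void set_to_seven() { g_last_value = 7; }
void set_to(int v) { g_last_value = v; }

// Small self-check showing both dispatch paths.
void check_dispatch() {
    call_by_signature(reinterpret_cast<void*>(&set_to_seven),
                      Signature::VoidVoid);
    assert(g_last_value == 7);
    call_by_signature(reinterpret_cast<void*>(&set_to),
                      Signature::VoidInt, 42);
    assert(g_last_value == 42);
}
```

This only scales to the signatures listed in the enum; anything outside that list still needs an approach such as libffi.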

Why is function with useless isolated `static` considered impure?

In Wikipedia article on Pure function, there is an example of impure function like this:
void f() {
    static int x = 0;
    ++x;
}
With the remark of "because of mutation of a local static variable".
I wonder why it is impure. It goes from unit type to unit type, so it always returns the same result for the same input. And it has no side effects: even though it has a static int variable, that variable is unobservable by any function other than f() itself, so there is no observable mutation of global state that other functions might use.
If one argues that any global mutation is disallowed, regardless of whether it is observable, then no real-life function can ever be considered pure, because any function allocates its memory on the stack, and allocation is impure, as it involves talking to the MMU via the OS, the allocated page might reside in a different physical page, and so on.
So, why does this useless isolated static int make the function impure?
The result of a pure function is fully defined by its input arguments. Here, the result means not only the returned value, but also the effect in terms of the virtual machine defined by the C/C++ standard. In other words, if the function occasionally exhibits undefined behavior with the same input arguments, it cannot be considered pure (because the behavior is different from one call to another with the same input).
In the particular case with the static local variable, that variable may become the source of a data race if f is called concurrently in multiple threads. Data race means undefined behavior. Another possible source of UB is signed integer overflow, which may eventually happen.
The concept of pure functions seems to only matter in... functional languages? Correct me if I'm wrong. The Wikipedia link you provide gives two references near the top, one of which is Professor Frisby's Mostly Adequate Guide to Functional Programming, where there are several different qualifications for a pure function, including:
does not have any observable side effect
This matters because one of the things we can do to a pure function (as opposed to an impure function) is memoization (from the link above), or input/output caching. Pure functions are also testable, reasonable, and self documenting.
I guess memoization matters for the compiler, so asking if a function is "pure" can be considered equivalent to asking if the compiler can memoize the function. It seems like the concept of a static local variable that no other piece of code touches is just bad code, and the compiler should issue a warning about it. But should the compiler optimize it away? And should the compiler try to figure out if any given static local variable actually has no side effects?
It seems like it's just easier to design the compiler to always flag a function as impure if it has a static local, instead of writing logic to hem and haw over whether the function is memoizable or not. See a local static? Boom: no longer pure.
So from a compiler's point of view, it's impure.
What about the other properties of a pure function?
testable, reasonable, and self documenting
Tests are written by a person, usually, so I'd argue this function is testable. Although some automated test-writing software might again see that it's not memoizable, and just choose to ignore writing tests for it entirely. This hypothetical software might just skip anything with local statics. Again, hypothetically.
Is the code reasonable? Certainly not. Although I'm not sure how much this matters. It doesn't do anything. It makes it hard to understand. ("Why did Bob write the function this way? Is this a Magic/More Magic situation?").
Is the code self-documenting? Again, I'd say not. But again, this is degenerate example code.
I think the biggest argument against this being considered a pure function is that a functional language compiler would be perfectly reasonable if it just assumed it wasn't pure.
I think the biggest argument for this being considered a pure function is that we can look at it with our own eyeballs and see that there's obviously no outside behavior. Ignore the fact that signed overflow is undefined. Replace this with a datatype that has defined overflow, and is atomic. Well now there's no undefined behavior, but it still looks weird.
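For concreteness, the replacement described above might look like the sketch below: the counter becomes unsigned (so wrap-around is defined) and atomic (so concurrent calls do not race). This is only one possible reading of "a datatype that has defined overflow, and is atomic".

```cpp
#include <atomic>
#include <cassert>

// Variant of f(): same shape as the Wikipedia example, but the counter is
// an atomic unsigned, so neither overflow nor concurrent calls are UB.
void f() {
    static std::atomic<unsigned> x{0};
    ++x;  // atomic read-modify-write; wraps to 0 after the maximum value
}

// Small self-check: unsigned arithmetic wraps around instead of overflowing.
void check_wraparound() {
    unsigned u = static_cast<unsigned>(-1);  // largest unsigned value
    ++u;
    assert(u == 0u);
}
```

The function still mutates hidden state on every call, which is why it "still looks weird" even with the undefined behavior gone.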
"In conclusion, I don't care whether it's pure or not."
Let me rephrase from my previous (above) conclusion.
I'm inclined to just scan a function for any mutation of static variables and call it a day. Boom, no longer pure.
Can the function be considered pure if we really think about it? Sure. But what's the point? If the definition of a pure function needs to be changed, argue for it to be changed. Seems like you think this is a pure function. That's fine, I see the merits in that. I also see the merits in considering it an impure function.
As much as this is a non-answer, it really depends on what you're using the definition of pure for. If it's writing a compiler? Probably want to use the more conservative definition of pure that allows false positives and excludes this function. If it's to impress a bunch of sophomore CS students while listening to Zep? Go for the definition that recognizes this has no side effects and call it a day.

How does c++11 resolve constexpr into assembly?

The basic question:
Edit: v-The question-v
class foo {
public:
    constexpr foo() { }
    // const qualifier written out: C++11 implies it for constexpr members
    constexpr int operator()(const int& i) const { return int(i); }
};
Performance is a non-trivial issue. How does the compiler actually compile the above? I know how I want it to be resolved, but how does the specification actually specify it will be resolved?
1) Seeing that the type int has a constexpr constructor, create an int object and compile the string of bytes that make up the type from memory into the code directly?
2) Replace any call to the overload with a call to int's constructor, if for some unknown reason int doesn't have a constexpr constructor? (Inlining the call.)
3) Create a function, call the function, and have that function call int's constructor?
Why I want to know, and how I plan to use the knowledge
edit:v-Background only-v
The real library I'm working with uses template arguments to decide how a given type should be passed between functions. That is, by reference or by value because the exact size of the type is unknown. It will be a user's responsibility to work within the limits I give them, but I want these limits to be as light and user friendly as I can sanely make them.
I expect a simple single-byte character to be passed around, in which case it should be passed by value. I do not bar a 300-megabyte behemoth that does several minutes of recalculation every time its copy constructor is invoked, in which case passing by reference makes more sense. I have only a list of requirements that a type must comply with, not a set cap on what a type can or cannot do.
I want to know the answer to my question so that I can, in good faith, make a function object that accepts this unknown template argument and then decides how, when, or even how much of an object should be copied, via a virtual member function and a pointer allocated with new if so required. If the compiler resolves constexpr badly, I need to know so I can abandon this line of thought and/or find a new one. Again, it will be the user's responsibility to work within the limits I give them, but I want these limits to be as light and user-friendly as I can sanely make them.
Edit: Thank you for your answers. The only real question was the second sentence, and it has now been answered. Everything else is background. If more background is required, allow me to restate the above:
I have a template with four arguments. The goal of the template is a routing protocol, be that TCP/IP (unlikely) or node to node within a game (possible). The first two arguments are for data storage; they have no requirements beyond a list of operators for each. The last two define how the data is passed within the template. By default this is by reference. For performance and freedom of use, these can be changed to pass information by value at the user's request.
Each is expected to be a single byte long. However, in the case of a metric for an EIGRP- or OSPF-like protocol, the second template argument could be a compound of a dozen or more different variables, each taking a non-trivial time to copy or recompute.
For ease of use I am investigating a function object that accepts the third and fourth template arguments to handle special cases and polymorphic classes that would otherwise fail to function or copy correctly. The goal is to not force a user to rebuild their objects from scratch. This would require planning for virtual functions that perform deep copies, or any number of other unknown oddities. The usefulness of the function object depends on how reliably a compiler can be depended on not to generate a cascade of function calls.
More helpful I hope?
The C++11 standard doesn't say anything about how constexpr will be compiled down to machine instructions. The standard just says that expressions that are constexpr may be used in contexts where a compile time constant value is required. How any particular compiler chooses to translate that to executable code is an implementation issue.
Now in general, with optimizations turned on you can expect a reasonable compiler to not execute any code at runtime for many uses of constexpr but there aren't really any guarantees. I'm not really clear on what exactly you're asking about in your example so it's hard to give any specifics about your use case.
constexpr expressions are not special. For all intents and purposes, they're basically const unless the context they're used in is constexpr and all variables/functions are also constexpr. It is implementation defined how the compiler chooses to handle this. The Standard never deals with implementation details because it speaks in abstract terms.
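To make the distinction concrete, here is a small sketch with a hypothetical Identity class modeled on the question's foo. In a context that requires a constant expression the compiler has to evaluate the call itself; with a run-time argument the same operator is an ordinary function, and whether it gets inlined is left to the optimizer.

```cpp
#include <cassert>

// Hypothetical function object modeled on the question's foo.
class Identity {
public:
    constexpr Identity() {}
    constexpr int operator()(const int& i) const { return i; }
};

// Contexts that require a constant expression: the compiler must evaluate
// these calls during compilation.
static_assert(Identity()(5) == 5, "evaluated at compile time");
constexpr int kFromConstexpr = Identity()(7);

// With a run-time argument the operator behaves like any other function;
// the standard makes no promise about the generated instructions.
int apply_at_runtime(int value) {
    return Identity()(value);
}

// Small self-check covering both uses.
void check_identity() {
    assert(kFromConstexpr == 7);
    assert(apply_at_runtime(3) == 3);
}
```

To see what a particular compiler actually emits for apply_at_runtime, inspecting its assembly output at the optimization level you ship with is the only reliable check.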

class forward declaration

Can I use a forward declaration for a class in order to put its definition and implementation later in the program, after it has been used (similar to what is done with functions)?
(I need to join multiple source files of a program into one file, and I want to put the classes' definitions and implementations at the end of the file so that main is at the top.)
Yes you can, to a certain extent.
You have to realize that the C++ compiler is quite stupid, and doesn't read ahead. This is the reason why you have to use function prototypes (among some other reasons).
Now, a function isn't hard for the compiler to resolve. It just looks at the return type of the function and the types of its parameters, and assumes that the function is there, without any knowledge of what's actually inside it, because that ultimately doesn't matter at that point.
However, the contents of the class do matter (the compiler needs to know the size of the class, for example). But remember the bit about not reading ahead? When you forward declare a class, the compiler doesn't know what's in it, and is therefore missing a lot of information about it. How much space does it need to reserve, for example?
Therefore, you can forward declare classes, but you can't use them as value types. The only thing you can do with a class before it has been defined is use pointers to it (and use it as a function return type and template argument, as pointed out by @Cheersandhth.-Alf).
If the thing you need to use isn't a pointer, you should probably use headers (read this if you want to learn more about that).
Without a class definition somewhere earlier, you can't use any class members, nor can you create any instances, but you can
use T* and T& types,
use T for formal return type and parameter declarations (yes even by value),
use T as a template parameter,
and possibly more, but the above is what occurred to me immediately.
So if that's all you need, then you're set to go with the forward-declarations.
However, all that the forward declaring buys you in the sketched situation is added work, maintaining the same code in two places, so it's difficult to see the point of it…
Oh, I just remembered, there is a particularly nasty Undefined Behavior associated with forward-declared incomplete types, namely using delete p where p is a pointer to an incomplete type. This is only safe if the class, once defined, turns out to have a trivial destructor and no class-specific operator delete; otherwise the behavior is undefined. If the compiler is good then it warns, but don't count on it.
In summary, I would just place main at the very end of that code, where it belongs, avoiding all the problems.
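The list of things that work with only a forward declaration can be sketched like this. Widget and the three functions are hypothetical names; the function doubled_value is defined before the class is, which works because it only passes a pointer along and never looks inside the object.

```cpp
#include <cassert>

class Widget;                        // forward declaration: incomplete type

// Allowed with an incomplete type: pointer parameters, and declarations of
// functions that take or return the type by value.
int widget_value(const Widget* w);
Widget make_widget(int value);

// Defined before Widget is complete; it never touches Widget's members.
int doubled_value(const Widget* w) {
    return 2 * widget_value(w);
}

// Not allowed at this point:
// Widget w;            // error: incomplete type
// w_ptr->value();      // error: member access into incomplete type

// The full definition can come later in the same file.
class Widget {
public:
    explicit Widget(int value) : value_(value) {}
    int value() const { return value_; }
private:
    int value_;
};

int widget_value(const Widget* w) { return w->value(); }
Widget make_widget(int value) { return Widget(value); }

// Small self-check, placed after the definition because it creates a Widget.
void check_widget() {
    Widget w = make_widget(21);
    assert(doubled_value(&w) == 42);
}
```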

C++ function-like value pass

First, sorry for the title, but I really don't know how to summarize what I want to do. I am trying to write a very simple "graphic" console game, just to learn the basics of C++ and programming generally. When I have a function, I can pass a value or a variable into that function when calling it. I would like to do the same thing with a piece of code, but without using a function, because when a function is called, the program must actually jump to the function and then return. So I thought it would be more CPU-saving to have that function built into main and somehow select what that code should process. This could be done by copying the value I want to process into some extra variable and letting that "function" process that variable, but since I work with 2-dimensional fields, I need two for loops to copy the user-selected field into my work field. So what I want to know is: is there some way to do this more efficiently? Again, please excuse my English; it's hard to describe something in a language you don't speak every day.
You just described inline functions (including the function when used rather than jump and return) and references (use the caller's variables rather than copy into the function).
Inline functions just happen automatically when you turn the optimizer on, conditions permitting. Not something to worry about.
References are something you should read about in whatever book you are using to learn C++. They are declared like int foo( int &callers_var ); and can capture things like a field in a matrix.
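A minimal sketch of passing a 2D field by reference, assuming a 3x3 board; the Field alias and both function names are made up for illustration. Nothing is copied: the functions work directly on the caller's variable, so the two copy loops from the question are not needed.

```cpp
#include <array>
#include <cassert>

// Hypothetical 3x3 playing field for a console game.
using Field = std::array<std::array<char, 3>, 3>;

// Takes the caller's field by reference: no copy is made, and the change
// lands directly in the caller's variable.
inline void mark_cell(Field& field, int row, int col, char symbol) {
    field[row][col] = symbol;
}

// A reference can also name a single cell of the field.
inline void clear_cell(char& cell) {
    cell = ' ';
}

// Small self-check showing both kinds of reference parameter.
void check_field() {
    Field field{};
    mark_cell(field, 1, 2, 'X');
    assert(field[1][2] == 'X');
    clear_cell(field[1][2]);
    assert(field[1][2] == ' ');
}
```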
As Roger said, never optimize until you have a functional program and can verify what is slow. That is the first rule of optimization.
Inline functions are the normal way to allow the compiler to avoid the overhead of a function call. However, it sounds like premature optimization here, and your efforts would be better spent elsewhere. A code example may help clarify what you want.