Related
I'm wondering if it's possible in C++ to declare a function parameter that must be a string literal? My goal is to receive an object that I can keep only the pointer to and know it won't be free()ed out from under me (i.e. has application lifetime scope).
For example, say I have something like:
#include <string.h>
struct Example {
Example(const char *s) : string(s) { }
const char *string;
};
void f() {
char *freeableFoo = strdup("foo");
Example e(freeableFoo); // e.string's lifetime is unknown
Example e1("literalFoo"); // e1.string is always valid
free(freeableFoo);
// e.string is now invalid
}
As shown in the example, when freeableFoo is free()ed the e.string member becomes invalid. This happens without Example's awareness.
Obviously we can get around this if Example copies the string in its constructor, but I'd like to not allocate memory for a copy.
Is a declaration possible for Example's constructor that says "you must pass a string literal" (enforced at compile-time) so Example knows it doesn't have to copy the string and it knows its string pointer will be valid for the application's lifetime?
In C++20 you can do it using a wrapper class that has a consteval converting constructor which takes a string literal:
struct literal_wrapper
{
template<class T, std::size_t N, std::enable_if_t<std::is_same_v<T, const char>>...>
consteval literal_wrapper(T (&s)[N]) : p(s) {}
char const* p;
};
The idea is that string literals have type const char[N] and we match this.
Then you can use this wrapper class in places where want to enforce passing a string literal:
void takes_literal(string_literal lit) {
// use lit.p here
}
You can call this as foo("foobar").
Note that this will also match static-storage const char[] arrays, like so:
const char array[] = {'a'};
takes_literal(array); // this compiles
Static arrays have most of the same characteristics as string literals, however, e.g., indefinite storage duration, which may work for you.
It does not match local arrays because the decayed pointer value is not a constant expression (that's where consteval comes in).
This answer is almost directly copied from the first variant suggested in C.M.'s comment on the question.
if it's possible in C++ to declare a function parameter that must be a string literal?
In C++ string literals have type char const[N], so that you can declare a parameter to be of such a type:
struct Example {
template<size_t N>
Example(char const(&string_literal)[N]); // In C++20 declare this constructor consteval.
template<size_t N>
Example(char(&)[N]) = delete; // Reject non-const char[N] arguments.
};
However, not every char const[N] is a string literal. One can have local variables and data members of such types. In C++20 you can declare the constructor as consteval to make it reject non-literal arguments for string_literal parameter.
Conceptually, you'd like to determine the storage duration of the argument to a constructor/function parameter. Or, more precisely, whether the argument has a longer lifetime than Example::string reference to it. C++ doesn't provide that, C++20 consteval is still a poor-man's proxy for that.
gcc extension __builtin_constant_p detetmines whether an expression is a compile-time constant which is widely used in preprocessor macros in Linux kernel source code. However, it can only evaluate to 1 on expressions, but never on functions' parameters, so that its use is limited to preprocessor macros.
The traditional solution for the problem of different object lifetimes has been organizing objects into a hierarchy, where objects at lower levels have smaller lifetimes than objects at higher levels, and hence, an object can always have a plain pointer to an object at a higher level of hierarchy valid. This approach is somewhat advanced, labour intensive and error prone, but it totally obviates the need for any garbage collection or smart-pointers, so that it's only used in ultra critical applications where no cost is too high. The opposite extreme of this approach is using std::shared_ptr/std::weak_ptr for everything which snowballs into maintenance nightmare pretty rapidly.
Just make the constructor explicitly taking rvalue:
struct Example {
Example(const char*&& s) : string(s) { }
const char* string;
};
TL;DR:
Why can't templated functions access the same conversions that non-templated functions can?
struct A {
A(std::nullptr_t) {}
};
template <typename T>
A makeA(T&& arg) {
return A(std::forward<T>(arg));
}
void foo() {
A a1(nullptr); //This works, of course
A a2(0); //This works
A a3 = makeA(0); //This does not
}
Background
I'm trying to write some templated wrapper classes to use around existing types, with the goal of being drop-in replacements with minimal need to rewrite existing code that uses the now-wrapped values.
One particular case I can't get my head around is as follows: we have a class which can be constructed from std::nullptr_t (here called A), and as such, there's plenty of places in the code base where someone has assigned zero to an instance.
However, the wrapper cannot be assigned a zero, despite forwarding the constructors. I have made a very similar example that reproduces the issue without using an actual wrapper class - a simple templated function is sufficient to show the issue.
I would like to allow that syntax of being able to assign zero to continue to be allowed - it isn't my favourite, but minimising friction to moving to newer code is often a necessity just to get people on board with using them.
I also don't want to add a constructor that takes any int other than zero because that's very much absurd, was never allowed before, and it should continue to be caught at compile time.
If such a thing is not possible, it would satisfy me to find an explanation, because with as much as I know so far, it makes no sense to me.
This example has the same behaviour in VC++ (Intellisense seems to be OK with it though...), Clang, and GCC. Ideally a solution will also work in all 3 (4 with intellisense) compilers.
A more directly applicable example is follows:
struct A {
A(){}
A(std::nullptr_t) {}
};
template <typename T>
struct Wrapper {
A a;
Wrapper(const A& a):a (a) {}
template <typename T>
Wrapper(T&& t): a(std::forward<T>(t)){}
Wrapper(){}
};
void foo2() {
A a1;
a1 = 0; // This works
Wrapper<A> a2;
a2 = 0; //This does not
}
Why has the compiler decided to treat the zero as an int?
Because it is an integer.
The literal 0 is a literal. Literals get to do funny things. String literals can be converted into const char* or const char[N], where N is the length of the string + NUL terminator. The literal 0 gets to do funny things too; it can be used to initialize a pointer with a NULL pointer constant. And it can be used to initialize an object of type nullptr_t. And of course, it can be used to create an integer.
But once it gets passed as a parameter, it can't be a magical compiler construct anymore. It becomes an actual C++ object with a concrete type. And when it comes to template argument deduction, it gets the most obvious type: int.
Once it becomes an int, it stops being a literal 0 and behaves exactly like any other int. Not unless it is used in a constexpr context (like your int(0)), where the compiler can figure out that it is indeed a literal 0 and therefore can take on its magical properties. Function parameters are never constexpr, and thus they cannot partake in this.
See [conv.ptr]/1:
A null pointer constant is an integer literal with value zero or a prvalue of type std::nullptr_t. A null pointer constant can be converted to a pointer type; the result is the null pointer value of that type [...]
So the integer literal 0 can be converted to a null pointer. But if you attempt to convert some other integer value, that is not a literal, to a pointer type then the above quote does not apply. In fact there is no other implicit conversion from integer to pointer (since none such is listed in [conv.ptr]), so your code fails.
Note: Explicit conversion is covered by [expr.reinterpret.cast]/5.
It's interesting that using the function name as a function pointer is equivalent to applying the address-of operator to the function name!
Here's the example.
typedef bool (*FunType)(int);
bool f(int);
int main() {
FunType a = f;
FunType b = &a; // Sure, here's an error.
FunType c = &f; // This is not an error, though.
// It's equivalent to the statement without "&".
// So we have c equals a.
return 0;
}
Using the name is something we already know in array. But you can't write something like
int a[2];
int * b = &a; // Error!
It seems not consistent with other parts of the language. What's the rationale of this design?
This question explains the semantics of such behavior and why it works. But I'm interested in why the language was designed this way.
What's more interesting is the function type can be implicitly converted to pointer to itself when using as a parameter, but will not be converted to a pointer to itself when using as a return type!
Example:
typedef bool FunctionType(int);
void g(FunctionType); // Implicitly converted to void g(FunctionType *).
FunctionType h(); // Error!
FunctionType * j(); // Return a function pointer to a function
// that has the type of bool(int).
Since you specifically ask for the rationale of this behavior, here's the closest thing I can find (from the ANSI C90 Rationale document - http://www.lysator.liu.se/c/rat/c3.html#3-3-2-2):
3.3.2.2 Function calls
Pointers to functions may be used either as (*pf)() or as pf().
The latter construct, not sanctioned in the Base Document, appears in
some present versions of C, is unambiguous, invalidates no old code,
and can be an important shorthand. The shorthand is useful for
packages that present only one external name, which designates a
structure full of pointers to object s and functions : member
functions can be called as graphics.open(file) instead of
(*graphics.open)(file). The treatment of function designators can
lead to some curious , but valid , syntactic forms . Given the
declarations :
int f ( ) , ( *pf ) ( ) ;
then all of the following expressions are valid function calls :
( &f)(); f(); (*f)(); (**f)(); (***f)();
pf(); (*pf)(); (**pf)(); (***pf)();
The first expression on each line was discussed in the previous
paragraph . The second is conventional usage . All subsequent
expressions take advantage of the implicit conversion of a function
designator to a pointer value , in nearly all expression contexts .
The Committee saw no real harm in allowing these forms ; outlawing
forms like (*f)(), while still permitting *a (for int a[]),
simply seemed more trouble than it was worth .
Basically, the equivalence between function designators and function pointers was added to make using function pointers a little more convenient.
It's a feature inherited from C.
In C, it's allowed primarily because there's not much else the name of a function, by itself, could mean. All you can do with an actual function is call it. If you're not calling it, the only thing you can do is take the address. Since there's no ambiguity, any time a function name isn't followed by a ( to signify a call to the function, the name evaluates to the address of the function.
That actually is somewhat similar to one other part of the language -- the name of an array evaluates to the address of the first element of the array except in some fairly limited circumstances (being used as the operand of & or sizeof).
Since C allowed it, C++ does as well, mostly because the same remains true: the only things you can do with a function are call it or take its address, so if the name isn't followed by a ( to signify a function call, then the name evaluates to the address with no ambiguity.
For arrays, there is no pointer decay when the address-of operator is used:
int a[2];
int * p1 = a; // No address-of operator, so type is int*
int (*p2)[2] = &a; // Address-of operator used, so type is int (*)[2]
This makes sense because arrays and pointers are different types, and it is possible for example to return references to arrays or pass references to arrays in functions.
However, with functions, what other type could be possible?
void foo(){}
&foo; // #1
foo; // #2
Let's imagine that only #2 gives the type void(*)(), what would the type of &foo be? There is no other possibility.
when I do this (in my class)
public:
Entity()
{
re_sprite_eyes = new sf::Sprite();
re_sprite_hair = new sf::Sprite();
re_sprite_body = new sf::Sprite();
}
private:
sf::Sprite* re_sprite_hair;
sf::Sprite* re_sprite_body;
sf::Sprite* re_sprite_eyes;
Everything works fine. However, if I change the declarations to this:
private:
sf::Sprite* re_sprite_hair, re_sprite_body, re_sprite_eyes;
I get this compiler error:
error: no match for 'operator=' in '((Entity*)this)->Entity::re_sprite_eyes = (operator new(272u), (<statement>, ((sf::Sprite*)<anonymous>)))
And then it says candidates for re_sprite_eyes are sf::Sprite objects and/or references.
Why does this not work? Aren't the declarations the same?
sf::Sprite* re_sprite_hair, re_sprite_body, re_sprite_eyes;
Does not declare 3 pointers - it is one pointer and 2 objects.
sf::Sprite* unfortunately does not apply to all the variables declared following it, just the first. It is equivalent to
sf::Sprite* re_sprite_hair;
sf::Sprite re_sprite_body;
sf::Sprite re_sprite_eyes;
You want to do:
sf::Sprite *re_sprite_hair, *re_sprite_body, *re_sprite_eyes;
You need to put one star for each variable. In such cases I prefer to keep the star on the variable's side, rather than the type, to make exactly this situation clear.
In both C and C++, the * binds to the declarator, not the type specifier. In both languages, declarations are based on the types of expressions, not objects.
For example, suppose you have a pointer to an int named p, and you want to access the int value that p points to; you do so by dereferencing the pointer with the unary * operator, like so:
x = *p;
The type of the expression *p is int; thus, the declaration of p is
int *p;
This is true no matter how many pointers you declare within the same declaration statement; if q and r also need to be declared as pointers, then they also need to have the unary * as part of the declarator:
int *p, *q, *r;
because the expressions *q and *r have type int. It's an accident of C and C++ syntax that you can write T *p, T* p, or T * p; all of those declarations will be interpreted as T (*p).
This is why I'm not fond of the C++ style of declaring pointer and reference types as
T* p;
T& r;
because it implies an incorrect view of how C and C++ declaration syntax works, leading to the exact kind of confusion that you just experienced. However, I've written enough C++ to realize that there are times when that style does make the intent of the code clearer, especially when defining container types.
But it's still wrong.
This is a (two years late) response to Lightness Races in Orbit (and anyone else who objects to my labeling the T* p convention as "wrong")...
First of all, you have the legion of questions just like this one that arise specifically from the use of the T* p convention, and how it doesn't work like people expect. How many questions on this site are on the order of "why doesn't T* p, q declare both p and q as pointers?"
It introduces confusion - that by itself should be enough to discourage its use.
But beyond that, it's inconsistent. You can't separate array-ness or function-ness from the declarator, why should you separate pointer-ness from it?
"Well, that's because [] and () are postfix operators, while * is unary". Yes, it is, so why aren't you associating the operator with its operand? In the declaration T* p, T is not the operand of *, so why are we writing the declaration as though it is?
If a is "an array of pointers", why should we write T* a[N]? If f is "a function returning a pointer", why should we write T* f()? The declarator system makes more sense and is internally consistent if you write those declarations as T *a[N] and T *f(). That should be obvious from the fact that I can use T as a stand-in for any type (indeed, for any sequence of declaration specifiers).
And then you have pointers to arrays and pointers to functions, where the * must be explicitly bound to the declarator1:
T (*a)[N];
T (*f)();
Yes, pointer-ness is an important property of the thing you're declaring, but so are array-ness and function-ness, and emphasizing one over the other creates more problems than it solves. Again, as this question shows, the T* p convention introduces confusion.
Because * is unary and a separate token on its own you can write T* p, T *p, T*p, and T * p and they'll all be accepted by the compiler, but they will all be interpreted as T (*p). More importantly, T* p, q, r will be interpreted as T (*p), q, r. That interpretation is more obvious if you write T *p, q, r. Yeah, yeah, yeah, "declare only one thing per line and it won't be a problem." You know how else to not make it a problem? Write your declarators properly. The declarator system itself will make more sense and you will be less likely to make mistake.
We're not arguing over an "antique oddity" of the language, it's a fundamental component of the language grammar and its philosophy. Pointer-ness is a property of the declarator, just like array-ness and function-ness, and pretending it's somehow not just leads to confusion and makes both C and C++ harder to understand than they need to be.
I would argue that making the dereference operator unary as opposed to postfix was a mistake2, but that's how it worked in B, and Ritchie wanted to keep as much of B as possible. I will also argue that Bjarne's promotion of the T* p convention is a mistake.
At this point in the discussion, somebody will suggest using a typedef liketypedef T arrtype[N];
arrtype* p;
which just totally misses the point and earns the suggester a beating with the first edition of "C: The Complete Reference" because it's big and heavy and no good for anything else.
Writing T a*[N]*() as opposed to T (*(*a)[N])() is definitely less eye-stabby and scans much more easily.
In C++11 you have a nice little workaround, which might be better than shifting spaces back and forth:
template<typename T> using type=T;
template<typename T> using func=T*;
// I don't like this style, but type<int*> i, j; works ok
type<int*> i = new int{3},
j = new int{4};
// But this one, imho, is much more readable than int(*f)(int, int) = ...
func<int(int, int)> f = [](int x, int y){return x + y;},
g = [](int x, int y){return x - y;};
Another thing that may call your attention is the line:
int * p1, * p2;
This declares the two pointers used in the previous example. But notice that there is an asterisk (*) for each pointer, in order for both to have type int* (pointer to int). This is required due to the precedence rules. Note that if, instead, the code was:
int * p1, p2;
p1 would indeed be of type int*, but p2 would be of type int. Spaces do not matter at all for this purpose. But anyway, simply remembering to put one asterisk per pointer is enough for most pointer users interested in declaring multiple pointers per statement. Or even better: use a different statemet for each variable.
From http://www.cplusplus.com/doc/tutorial/pointers/
The asterisk binds to the pointer-variable name. The way to remember this is to notice that in C/C++, declarations mimic usage.
The pointers might be used like this:
sf::Sprite *re_sprite_body;
// ...
sf::Sprite sprite_bod = *re_sprite_body;
Similarly,
char *foo[3];
// ...
char fooch = *foo[1];
In both cases, there is an underlying type-specifier, and the operator or operators required to "get to" an object of that type in an expression.
It's interesting that using the function name as a function pointer is equivalent to applying the address-of operator to the function name!
Here's the example.
typedef bool (*FunType)(int);
bool f(int);
int main() {
FunType a = f;
FunType b = &a; // Sure, here's an error.
FunType c = &f; // This is not an error, though.
// It's equivalent to the statement without "&".
// So we have c equals a.
return 0;
}
Using the name is something we already know in array. But you can't write something like
int a[2];
int * b = &a; // Error!
It seems not consistent with other parts of the language. What's the rationale of this design?
This question explains the semantics of such behavior and why it works. But I'm interested in why the language was designed this way.
What's more interesting is the function type can be implicitly converted to pointer to itself when using as a parameter, but will not be converted to a pointer to itself when using as a return type!
Example:
typedef bool FunctionType(int);
void g(FunctionType); // Implicitly converted to void g(FunctionType *).
FunctionType h(); // Error!
FunctionType * j(); // Return a function pointer to a function
// that has the type of bool(int).
Since you specifically ask for the rationale of this behavior, here's the closest thing I can find (from the ANSI C90 Rationale document - http://www.lysator.liu.se/c/rat/c3.html#3-3-2-2):
3.3.2.2 Function calls
Pointers to functions may be used either as (*pf)() or as pf().
The latter construct, not sanctioned in the Base Document, appears in
some present versions of C, is unambiguous, invalidates no old code,
and can be an important shorthand. The shorthand is useful for
packages that present only one external name, which designates a
structure full of pointers to object s and functions : member
functions can be called as graphics.open(file) instead of
(*graphics.open)(file). The treatment of function designators can
lead to some curious , but valid , syntactic forms . Given the
declarations :
int f ( ) , ( *pf ) ( ) ;
then all of the following expressions are valid function calls :
( &f)(); f(); (*f)(); (**f)(); (***f)();
pf(); (*pf)(); (**pf)(); (***pf)();
The first expression on each line was discussed in the previous
paragraph . The second is conventional usage . All subsequent
expressions take advantage of the implicit conversion of a function
designator to a pointer value , in nearly all expression contexts .
The Committee saw no real harm in allowing these forms ; outlawing
forms like (*f)(), while still permitting *a (for int a[]),
simply seemed more trouble than it was worth .
Basically, the equivalence between function designators and function pointers was added to make using function pointers a little more convenient.
It's a feature inherited from C.
In C, it's allowed primarily because there's not much else the name of a function, by itself, could mean. All you can do with an actual function is call it. If you're not calling it, the only thing you can do is take the address. Since there's no ambiguity, any time a function name isn't followed by a ( to signify a call to the function, the name evaluates to the address of the function.
That actually is somewhat similar to one other part of the language -- the name of an array evaluates to the address of the first element of the array except in some fairly limited circumstances (being used as the operand of & or sizeof).
Since C allowed it, C++ does as well, mostly because the same remains true: the only things you can do with a function are call it or take its address, so if the name isn't followed by a ( to signify a function call, then the name evaluates to the address with no ambiguity.
For arrays, there is no pointer decay when the address-of operator is used:
int a[2];
int * p1 = a; // No address-of operator, so type is int*
int (*p2)[2] = &a; // Address-of operator used, so type is int (*)[2]
This makes sense because arrays and pointers are different types, and it is possible for example to return references to arrays or pass references to arrays in functions.
However, with functions, what other type could be possible?
void foo(){}
&foo; // #1
foo; // #2
Let's imagine that only #2 gives the type void(*)(), what would the type of &foo be? There is no other possibility.