How does a function detect string pointer vs string literal argument? - c++

I have encountered a function that can differentiate between being called as
foo("bar");
vs
const char *bob = "bar";
foo(bob);
Possibilities I have thought of are:
Address of string: both arguments sit in the .rdata section of the image. If I make both calls in the same program, both calls receive the same string address.
RTTI: no idea how RTTI can be used to detect such differences.
The only working example I could conjure up is:
void foo(char *msg)
{
printf("string literal");
}
void foo(const char *&msg)
{
printf("string pointer");
}
foo("bar"); // "string literal"
const char *soap = "bar";
foo(soap); // "string pointer"
I do not have access to the function's code, and the declarations in the header file only revealed one function declaration.

Here's another way to distinguish between a string literal and a pointer, based on the fact that string literals have array type, not pointer type:
#include <iostream>
void foo(char *msg)
{
std::cout << "non-const char*\n";
}
void foo(const char *&msg) // & needed, else this is preferred to the
// template function for a string literal
{
std::cout << "const char*\n";
}
template <int N>
void foo(const char (&msg)[N])
{
std::cout << "const char array reference ["<< N << "]\n";
}
int main() {
foo("bar"); // const char array reference [4]
}
But note that all of them (including your original function) can be "fooled" by passing something that isn't a string literal:
const char *soap = 0;
foo(soap);
char *b = 0;
foo(b);
const char a[4] = {};
foo(a);
There is no type in C++ which is unique to string literals. So, you can use the type to tell the difference between an array and a pointer, but not to tell the difference between a string literal and another array. RTTI is no use, because it only provides dynamic type information for classes with at least one virtual member function; for anything else, typeid reports the static type, which the overloads above already distinguish. Anything else is implementation-dependent: there is no guarantee in the standard that string literals will occupy any particular region of memory, or that the same string literal used twice in a program (or even in a compilation unit) will have the same address. In terms of storage location, anything that an implementation can do with string literals, it is permitted also to do with my array a.
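To make the point about types concrete, here is a minimal sketch (C++11 or later). The names binds_as_array, not_a_literal and check_array_vs_pointer are made up for illustration; the static_asserts only restate what the language says about the type of a string literal.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// A string literal is an lvalue of type const char[N] (N counts the '\0'),
// so decltype yields a reference to that array type.
static_assert(std::is_same<decltype("bar"), const char (&)[4]>::value,
              "\"bar\" is a const char[4]");

// A hand-written array has exactly the same type as the literal.
const char not_a_literal[4] = {};
static_assert(std::is_same<decltype(not_a_literal), const char[4]>::value,
              "same type as the literal");

// Hypothetical helper: true when the argument binds as an array.
template <std::size_t N>
bool binds_as_array(const char (&)[N]) { return true; }
// Non-const reference on purpose, so this overload cannot take an array.
inline bool binds_as_array(const char*&) { return false; }

inline void check_array_vs_pointer() {
    const char* pointer = "bar";
    assert(binds_as_array("bar"));          // literal: array overload
    assert(binds_as_array(not_a_literal));  // ordinary array: same overload
    assert(!binds_as_array(pointer));       // pointer variable: pointer overload
}
```

Calling check_array_vs_pointer() from main exercises all three cases; the literal and the ordinary array are indistinguishable to overload resolution.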

The function foo() in theory could use a macro to determine if the argument was a literal or not.
#define foo(X) (*#X == '"' \
? foo_string_literal(X) \
: foo_not_string_literal(X))
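As a rough illustration of how the stringification trick behaves, the sketch below reduces it to a yes/no test. IS_PLAIN_LITERAL, GREETING and check_stringification are hypothetical names, and nothing here is claimed to be what the library in question actually does; it also shows two cases the trick misses.

```cpp
#include <cassert>

// Hypothetical macro name. #X turns the argument's source text into a string,
// so the first character is a double quote only for a plain "..." literal.
#define IS_PLAIN_LITERAL(X) (*#X == '"')

#define GREETING "bar"  // a macro that expands to a literal

inline void check_stringification() {
    const char* bob = "bar";
    (void)bob;  // only the spelling "bob" is inspected, never the value
    assert(IS_PLAIN_LITERAL("bar"));      // spelled as a literal
    assert(!IS_PLAIN_LITERAL(bob));       // spelled as an identifier
    assert(!IS_PLAIN_LITERAL(GREETING));  // # does not expand macro arguments
    assert(!IS_PLAIN_LITERAL(u8"bar"));   // prefixed literal starts with 'u'
}
```

The test looks at how the argument is written, not at what it is, so it only works when the header can hide the real function behind a macro.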

And what happens if you call it as:
const char bob[] = "bar";
foo(bob);
It's probably using some sort of distinction like that to make the determination.
EDIT: If there's only one function declaration in the header I can't conceive of any portable way the library could make that distinction.

Related

std::string_view with C Function

I'm using some legacy C code within a C++ project.
One of the C functions I use looks like this:
void Add_To_log(const * const char pString_1, const * const char pString_2, int number);
Now when I call this function from C++ code like this
foo()
{
Add_To_log("my first string", "my second string", 2);
}
I get a compiler warning: ISO C++ forbids converting string to char.
So, to get rid of this, I thought of creating a C++ wrapper with string_view to avoid unnecessary copying of my strings:
void CPP_Wrapper(const string_view& string1, const string_view& string2, int number)
{
Add_To_log(string1, string2, 2);
}
Now, if I understood the reference correctly, string_view does not necessarily contain a terminating null character, which is essential for all C functions, because it does not own the string object; it is simply a view of it.
However, can I assume in my particular case that string1 and string2 are null terminated?
No. You should not assume that a string view is null terminated. The wrapper function that you suggest is counterproductive if the C function expects a null-terminated string.
One of the C functions I use looks like this:
void Add_To_log(const * const char pString_1, const * const char pString_2, int number);
That declaration is ill-formed. If you fix it to be something like:
void Add_To_log(const char * const pString_1, const char * const pString_2, int number);
then this call is well-formed:
Add_To_log("my first string", "my second string", 2); // No problem
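If a wrapper taking std::string_view is still wanted, one conservative option is to copy the view into a std::string before calling the C function. The sketch below (C++17) uses hypothetical stand-ins, legacy_length and wrapped_length, rather than the real Add_To_log, and shows why passing data() directly is not enough.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical stand-in for a legacy C function: reads up to the first '\0'.
inline std::size_t legacy_length(const char* s) { return std::strlen(s); }

// Hypothetical wrapper: copy the view into a std::string, whose c_str()
// is guaranteed to be null terminated, before calling the C function.
inline std::size_t wrapped_length(std::string_view sv) {
    const std::string copy(sv);
    return legacy_length(copy.c_str());
}

inline void check_view_termination() {
    const std::string_view whole = "first second";
    const std::string_view part = whole.substr(0, 5);  // views "first"
    assert(part.size() == 5);
    assert(legacy_length(part.data()) == 12);  // runs past the view's end
    assert(wrapped_length(part) == 5);         // the copy stops where the view does
}
```

The copy costs an allocation for longer strings, which is the price of not knowing whether the viewed characters are followed by a terminator.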
std::string already has member functions that provide a pointer for older C library functions:
http://www.cplusplus.com/reference/string/string/data/
These provide a non-owning, read only pointer suitable to most C library functions that need read only access during the function call. I'm assuming the std::string has a greater lifetime than the function call, and that the pointer is used only during the function call. Or as the documentation I linked above states, "The pointer returned may be invalidated by further calls to other member functions that modify the object." (including the destructor obviously)
Also, take care to use c_str() in C++98 builds, as data() doesn't guarantee the terminating null until C++11, as noted in the documentation link and by eerorika.
#include <stdio.h>
#include <string>
extern "C" {
void legacy_logger(const char * const pstr) {
printf("%s\n", pstr);
}
}
int main()
{
std::string message{ "This is the string." };
legacy_logger(message.data());
}

Implicit conversion from constant character array to character array

In the code snippet below, when I pass a string literal to the function, it gives me the warning ISO C++ forbids converting a string constant to 'char*', but when I initialize a character array with this literal, the warning is gone. I know that the type of string literals in C++ is constant character array, but the type of the ch variable is just array of char (not constant char).
#include <iostream>
using namespace std;
void func(char s[])
{
cout << s;
}
int main() {
char ch[] = "what";
func(ch);
func("what"); //gives warning
return 0;
}
I have one more question. When I add const to the parameter type of func, there is no warning in this situation either, even though I pass a character array to the function, not a const character array. I thought it should cause a warning for the func(ch) call, because ch is a character array, not a constant character array.
#include <iostream>
using namespace std;
void func(const char s[])
{
cout << s;
}
int main() {
char ch[] = "what";
func(ch);
func("what");
return 0;
}
const is not about matching exactly, but about what the function is doing to the parameter.
If you define a function with a const parameter, the function promises not to change the passed variable. Therefore, you can call it with constant strings as well as (changeable) non-constant variables. The compiler will reject any attempt to modify the value inside the function (because you promised you wouldn't).
If you define a function with a non-constant parameter, it can change the parameter. Therefore, the compiler warns you if you pass a constant string, as modifying it would lead to undefined behavior / crashes.
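The asymmetry can be checked directly. This is a small sketch (C++11 or later); count_chars and check_const_parameter are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Adding const is an implicit conversion; removing it is not.
static_assert(std::is_convertible<char*, const char*>::value,
              "char* converts to const char*");
static_assert(!std::is_convertible<const char*, char*>::value,
              "const char* does not convert to char*");

// Hypothetical read-only function: accepts both kinds of argument.
inline std::size_t count_chars(const char s[]) {
    std::size_t n = 0;
    while (s[n] != '\0') ++n;
    return n;
}

inline void check_const_parameter() {
    char ch[] = "what";
    assert(count_chars(ch) == 4);      // modifiable array
    assert(count_chars("what") == 4);  // string literal
}
```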

Why aren't string literals passed as references to arrays instead of opaque pointers?

In C++, the type of string literals is const char [N], where N, as std::size_t, is the number of characters plus one (the zero-byte terminator). They reside in static storage and are available from program initialization to termination.
Often, functions taking a constant string don't need the interface of std::basic_string or would prefer to avoid dynamic allocation; they may just need, for instance, the string itself and its length. std::basic_string, particularly, has to offer a way to be constructed from the language's native string literals. Such functions offer a variant that takes a C-style string:
void function_that_takes_a_constant_string ( const char * /*const*/ s );
// Array-to-pointer decay happens, and takes away the string's length
function_that_takes_a_constant_string( "Hello, World!" );
As explained in this answer, arrays decay to pointers, but their dimensions are taken away. In the case of string literals, this means that their length, which was known at compile-time, is lost and must be recalculated at runtime by iterating through the pointed memory until a zero-byte is found. This is not optimal.
However, string literals, and, in general, arrays, may be passed as references using template parameter deduction to keep their size:
template<std::size_t N>
void function_that_takes_a_constant_string ( const char (& s)[N] );
// Transparent, and the string's length is kept
function_that_takes_a_constant_string( "Hello, World!" );
The template function could serve as a proxy to another function, the real one, which would take a pointer to the string and its length, so that code exposure was avoided and the length was kept.
// Calling the wrapped function directly would be cumbersome.
// This wrapper is transparent and preserves the string's length.
template<std::size_t N> inline auto
function_that_takes_a_constant_string
( const char (& s)[N] )
{
// `s` decays to a pointer
// `N-1` is the length of the string
return function_that_takes_a_constant_string_private_impl( s , N-1 );
}
// Isn't everyone happy now?
function_that_takes_a_constant_string( "Hello, World!" );
Why isn't this used more broadly? In particular, why doesn't std::basic_string have a constructor with the proposed signature?
Note: I don't know how the proposed parameter is named; if you know how, please suggest an edit to the question's title.
It's largely historical, in a sense. While you're correct that there's no real reason this can't be done (if you don't want to use your whole buffer, pass a length argument, right?) it's still true that if you have a character array it's usually a buffer not all of which you're using at any one time:
char buf[MAX_LEN];
Since this is usually how they're used, it seems needless or even risky to go to the trouble of adding a new basic_string constructor template for const CharT (&)[N].
The whole thing is pretty borderline though.
The trouble with adding such a templated overload is simple:
It would be used whenever the function is called with a static buffer of char-type, even if the buffer is not as a whole a string, and you really wanted to pass only the initial string (embedded zeroes are far less common than terminating zeroes, and using part of a buffer is very common): Current code rarely contains explicit decay from array to pointer to first element, using a cast or function-call.
Demo-code (On coliru):
#include <stdio.h>
#include <string.h>
auto f(const char* s, size_t n) {
printf("char* size_t %u\n", (unsigned)n);
(void)s;
}
auto f(const char* s) {
printf("char*\n");
return f(s, strlen(s));
}
template<size_t N> inline auto
f( const char (& s)[N] ) {
printf("char[%u]\n", (unsigned)N);
return f(s, N-1);
}
int main() {
char buffer[] = "Hello World";
f(buffer);
f(+buffer);
buffer[5] = 0;
f(buffer);
f(+buffer);
}
Keep in mind: If you talk about a string in C, it always denotes a 0-terminated string, while in C++ it can also denote a std::string, which is counted.
I believe this is being addressed in C++14 building on user defined string literals
http://en.cppreference.com/w/cpp/string/basic_string/operator%22%22s
#include <string>
int main()
{
//no need to write 'using namespace std::literals::string_literals'
using namespace std::string_literals;
std::string s2 = "abc\0\0def"; // forms the string "abc"
std::string s1 = "abc\0\0def"s; // form the string "abc\0\0def"
}
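The difference between the two initializations can be confirmed by checking the resulting sizes. This is a small sketch (C++14 or later); check_literal_sizes is a hypothetical name.

```cpp
#include <cassert>
#include <string>

inline void check_literal_sizes() {
    using namespace std::string_literals;
    const std::string via_pointer = "abc\0\0def";  // constructed from const char*
    const std::string via_suffix = "abc\0\0def"s;  // operator""s receives the length
    assert(via_pointer.size() == 3);  // stops at the first '\0'
    assert(via_suffix.size() == 8);   // keeps both embedded '\0' characters
    assert(via_suffix[5] == 'd');
}
```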
You can create a helper class that fixes this without writing an overload for every function:
#include <cstddef>
#include <string>
struct string_view
{
const char* ptr;
std::size_t size; // number of chars, including the terminating '\0'
template<std::size_t N>
string_view(const char (&s)[N])
: ptr(s), size(N) // N already counts the '\0' of a string literal
{
}
string_view(const std::string& s)
: ptr(s.data()), size(s.size() + 1) // +1 for the '\0' at the end
{
}
};
void f(string_view) {}
int main()
{
string_view s { "Hello world!" };
f("test");
}
You should expand this class with helper functions (like begin and end) to simplify usage in your program.

Is it safe to overload char* and std::string?

I have just read about overloading functions in a beginner's book.
Just out of curiosity, I'd like to ask whether it is safe to overload between char* and std::string.
I played with the code below and got some results, but I was not sure whether it is undefined behavior.
void foo(std::string str) {
cout << "This is the std::string version. " << endl;
}
void foo(char* str) {
cout << "This is the char* version. " << endl;
}
int main(int argc, char *argv[]) {
foo("Hello"); // result shows char* version is invoked
std::string s = "Hello";
foo(s); // result shows std::string version
return 0;
}
Yes, it's safe, as long as you make it const char*, and actually often useful. String literals cannot be converted to char* since C++11 (and it was deprecated before that).
The const char* overload will be picked for a string literal because a string literal is a const char[N] (where N is the number of characters plus one for the null terminator). Overloads have a kind of priority ordering that decides which one is picked when several would work. It's considered a better match to perform array-to-pointer conversion than to construct a std::string.
Why can overloading std::string and const char* be useful? If you had, for example, one overload for std::string and one for a bool, the bool overload would get called when you passed a string literal. That's because the bool overload is still considered a better match than constructing a std::string. We can get around this by providing a const char* overload, which will beat the bool overload, and can just forward to the std::string overload.
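The bool pitfall and the forwarding fix can be sketched as follows. The namespaces naive and fixed and the function pick are hypothetical names chosen only to show the two overload sets side by side.

```cpp
#include <cassert>
#include <string>

namespace naive {  // hypothetical: only bool and std::string overloads
inline std::string pick(bool) { return "bool"; }
inline std::string pick(const std::string&) { return "std::string"; }
}  // namespace naive

namespace fixed {  // hypothetical: adds a forwarding const char* overload
inline std::string pick(bool) { return "bool"; }
inline std::string pick(const std::string&) { return "std::string"; }
inline std::string pick(const char* s) { return pick(std::string(s)); }
}  // namespace fixed

inline void check_bool_pitfall() {
    // Pointer-to-bool is a standard conversion, so it beats the
    // user-defined conversion to std::string.
    assert(naive::pick("text") == "bool");
    // The const char* overload is a better match and forwards explicitly.
    assert(fixed::pick("text") == "std::string");
    assert(fixed::pick(true) == "bool");
}
```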
Short Answer: Perfectly safe. Consider the following uses:
foo("bar");//uses c string
foo(std::string("bar") );//uses std::string
const char* bar = "bar";
foo(bar);//uses c string
std::string bar_string = "bar";
foo(bar_string);//uses std::string
foo(bar_string.c_str()); //uses c string
Word of warning: since C++11 a string literal no longer converts to char* (compilers that still accept it emit a warning), so the parameter must be declared const for string literals to be passed.
For instance, in order to get this:
foo("bar");
You need this:
void foo(const char* bar);

C++ deprecated conversion from string constant to 'char*'

I have a class with a private char str[256];
and for it I have an explicit constructor:
explicit myClass(char *func)
{
strcpy(str,func);
}
I call it as:
myClass obj("example");
When I compile this I get the following warning:
deprecated conversion from string constant to 'char*'
Why is this happening?
This is the message you see whenever you have a situation like the following:
char* pointer_to_nonconst = "string literal";
Why? Well, C and C++ differ in the type of the string literal. In C the type is array of char and in C++ it is constant array of char. In any case, you are not allowed to change the characters of the string literal, so the const in C++ is not really a restriction but more of a type safety thing. A conversion from const char* to char* is generally not possible without an explicit cast for safety reasons. But for backwards compatibility with C, C++03 still allowed assigning a string literal to a char* and marked the conversion as deprecated, which is what the warning reports. C++11 removed the conversion, although many compilers still accept it with a warning.
So, somewhere you are missing one or more consts in your program for const correctness. But the code you showed to us is not the problem as it does not do this kind of deprecated conversion. The warning must have come from some other place.
The warning:
deprecated conversion from string constant to 'char*'
is given because you are doing somewhere (not in the code you posted) something like:
void foo(char* str);
foo("hello");
The problem is that you are trying to convert a string literal (with type const char[]) to char*.
You can convert a const char[] to const char* because the array decays to a pointer, but what you are doing is converting a constant into a mutable.
This conversion is probably allowed for C compatibility and just gives you the warning mentioned.
As answer no. 2 by fnieto - Fernando Nieto clearly and correctly describes that this warning is given because somewhere in your code you are doing (not in the code you posted) something like:
void foo(char* str);
foo("hello");
However, if you want to keep your code warning-free as well, then just make the corresponding change in your code:
void foo(char* str);
foo((char *)"hello");
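That is, simply cast the string constant to (char *). This silences the warning, but it is only safe if foo never writes through the pointer, because modifying a string literal is undefined behavior.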
There are 3 solutions:
Solution 1:
const char *x = "foo bar";
Solution 2:
char *x = (char *)"foo bar";
Solution 3:
char* x = (char*) malloc(strlen("foo bar")+1); // +1 for the terminator
strcpy(x,"foo bar");
An array can also be used instead of a pointer: char x[] = "foo bar"; copies the literal's characters into a modifiable array, so no conversion to char* is involved.
Update: See the comments for security concerns regarding solution 3.
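When the callee really does write through its char* argument, a writable copy avoids both the cast and the manual malloc/free. The following is a minimal sketch; shout and check_writable_copies are hypothetical names standing in for such a function and its callers.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>
#include <string>

// Hypothetical legacy function that writes through its char* argument.
inline void shout(char* s) {
    for (; *s != '\0'; ++s)
        *s = static_cast<char>(std::toupper(static_cast<unsigned char>(*s)));
}

inline void check_writable_copies() {
    char array_copy[] = "foo bar";  // the literal's characters are copied
    shout(array_copy);
    assert(std::strcmp(array_copy, "FOO BAR") == 0);

    std::string string_copy = "foo bar";
    shout(&string_copy[0]);  // contiguous and '\0'-terminated since C++11
    assert(string_copy == "FOO BAR");
}
```

Both copies are released automatically, so there is nothing to free.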
A reason for this problem (which is even harder to detect than the issue with char* str = "some string" - which others have explained) is when you are using constexpr.
constexpr char* str = "some string";
It seems that it would behave like const char* str, and so would not cause a warning, because constexpr appears before char*, but it instead behaves as char* const str.
Details
Constant pointer, and pointer to a constant. The difference between const char* str, and char* const str can be explained as follows.
const char* str : Declare str to be a pointer to a const char. This means that the data to which this pointer is pointing is constant. The pointer can be modified, but any attempt to modify the data would cause a compilation error.
str++ ; : VALID. We are modifying the pointer, and not the data being pointed to.
*str = 'a'; : INVALID. We are trying to modify the data being pointed to.
char* const str : Declare str to be a const pointer to char. This means that the pointer is now constant, but the data being pointed to is not. The pointer cannot be modified, but we can modify the data using the pointer.
str++ ; : INVALID. We are trying to modify the pointer variable, which is a constant.
*str = 'a'; : VALID. We are trying to modify the data being pointed to. In our case this will not cause a compilation error, but will probably cause a runtime error, as the string will most probably go into a read-only section of the compiled binary. This statement would make sense if we had dynamically allocated memory, e.g. char* const str = new char[5];.
const char* const str : Declare str to be a const pointer to a const char. In this case we can neither modify the pointer, nor the data being pointed to.
str++ ; : INVALID. We are trying to modify the pointer variable, which is a constant.
*str = 'a'; : INVALID. We are trying to modify the data pointed by this pointer, which is also constant.
In my case the issue was that I was expecting constexpr char* str to behave as const char* str, and not char* const str, since visually it seems closer to the former.
Also, the warning generated for constexpr char* str = "some string" is slightly different from char* str = "some string".
Compiler warning for constexpr char* str = "some string": ISO C++11 does not allow conversion from string literal to 'char *const'
Compiler warning for char* str = "some string": ISO C++11 does not allow conversion from string literal to 'char *'.
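The placement of const can be verified with type traits. This is a small sketch (C++11 or later); the names message and check_constexpr_pointer are made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

constexpr const char* message = "some string";  // const applies to the chars too

// constexpr makes the pointer itself const; the pointee's const is spelled out.
static_assert(std::is_same<decltype(message), const char* const>::value,
              "constexpr const char* is const char* const");

// Without the extra const, constexpr char* would mean char* const.
static_assert(std::is_same<std::add_const<char*>::type, char* const>::value,
              "const added to char* lands on the pointer");

inline void check_constexpr_pointer() { assert(message[0] == 's'); }
```

Writing constexpr const char* therefore declares the pointer that most people expect from constexpr char*.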
Tip
You can use the C gibberish ↔ English converter to convert C declarations to easily understandable English statements, and vice versa. This is a C-only tool, and thus won't support things (like constexpr) which are exclusive to C++.
In fact, a string literal is neither a const char* nor a char* but an array: const char[N] in C++ (char[N] in C). It's quite strange, but it is written down in the C++ specification. If you modify it, the behavior is undefined, because the compiler may store it in read-only memory such as the code segment.
Maybe you can try this:
void foo(const char* str)
{
// Do something
}
foo("Hello");
It works for me
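I solved this problem by adding this macro somewhere at the beginning of the code (or add it in <iostream>, hehe). Note that the resulting pointer refers to a temporary std::string, so it is valid only until the end of the full expression in which the macro is used.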
#define C_TEXT( text ) ((char*)std::string( text ).c_str())
I also had the same problem. What I simply did was use const char* instead of char*, and the problem was solved. As others have mentioned above, it is a compatibility issue: C treats string literals as char arrays, while C++ treats them as const char arrays.
For what its worth, I find this simple wrapper class to be helpful for converting C++ strings to char *:
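#include <string>
#include <vector>
class StringWrapper {
std::vector<char> vec;
public:
StringWrapper(const std::string &str) : vec(str.begin(), str.end()) {
vec.push_back('\0'); // terminate the copy so C functions can read it
}
char *getChars() {
return &vec[0];
}
};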
The following illustrates the solution: assign your string literal to a variable of type pointer to const char (the literal itself is a constant array of char, which decays to such a pointer):
#include <iostream>
void Swap(const char * & left, const char * & right) {
const char *const temp = left;
left = right;
right = temp;
}
int main() {
const char * x = "Hello"; // These work because you are making a variable
const char * y = "World"; // pointer to a constant string
std::cout << "x = " << x << ", y = " << y << '\n';
Swap(x, y);
std::cout << "x = " << x << ", y = " << y << '\n';
}