Decide if const char* is a string literal or a variable - c++

Is there any simple method to detect whether the parameter passed to a function(const char *argument) was a constant literal or a variable?
I'm trying to fix errors in some code, which is filled with IsBadWritePtr calls, which throw access violation exceptions if the parameter was a constant literal.
This was a terrible design stupidity but now I'm not allowed to change the awkward behavior.

You can add a different overload that will be a better match for string literals. This is not really science but just heuristics:
void f(const char* p); // potential literal
void f(char *p); // pointer to non-const
Another idea would be taking advantage of the fact that literals are really arrays:
template <int N>
void f(const char (&_)[N]); // potential literal
Note that they don't quite detect literal vs. not literal, but rather some of the other features. const char* p = createANewString(); f(p); will resolve to f(const char*), and const char x[] = { 'A', 'b', 'c', '\0' }; will resolve to the template. Neither of them are literals, but you probably don't want to modify either.
Once you make that change, it should be simple to find out where each of the overloads is called.
This all works on the premise that the main function should not take the argument as const char* if it modifies it internally, and that the issue you are facing is because for backwards compatibility your compiler is allowing the call to a function that takes a pointer to non-const with a literal...
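As a rough illustration of the overload heuristic above, the sketch below reports which overload a call selects. The names ArgKind and classify are made up for this example, and it assumes a C++11 or later compiler, where a string literal no longer converts implicitly to char*.

```cpp
// Hypothetical names; only the shape of the overload set mirrors the answer above.
enum class ArgKind { PossiblyLiteral, Mutable };

// Selected for string literals and for any other pointer to const char.
constexpr ArgKind classify(const char*) { return ArgKind::PossiblyLiteral; }

// Selected for writable buffers, which therefore cannot be literals.
constexpr ArgKind classify(char*) { return ArgKind::Mutable; }
```

Calling classify("abc") picks the const char* overload, while char buf[] = "abc"; classify(buf) picks the char* one. A const char* that points at heap memory also lands in the first overload, which is why this remains a heuristic.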

I don't think there is a way to detect that, at least not without some hackery.
Since the interface takes a const char * the responsibility of the function is to not modify the passed string anyways. You need to modify the implementation because it is simply incorrect.

VirtualQuery can be used to detect if the address is writable, read-only, or inaccessible. Examine the State and Protect members of the returned MEMORY_BASIC_INFORMATION structure to see if the memory is accessible and has the access you need.

One VERY hackish way would involve checking if the pointer is in the .rdata segment.
Use dumpbin /headers after the build to retrieve the offset and length of the .rdata section, or parse the PE headers yourself. Naturally, this is toolchain-specific and generally a bad idea. Also, if the code needs to interoperate with DLLs, you'd have to check several executables and several .rdata segments.

Related

c++20 handling of string literals

We’re updating a project to c++20, and are running into errors where we pass string literals into functions which take char *. I know this has been changed to make code more safe, but we are interfacing with libraries which we cannot change.
I’d rather not disable the strict treatment of literals via compiler flags, so is there a good way to wrap these literals just in these particular cases?
I was thinking of an inline function, that was named something specific to the library, that internally would use const_cast. That way later if we want to change the code because the library gets updated, we know exactly where to look.
Any other ideas?
"Any other ideas?"
static char my_string[] = "string";
...
//elsewhere in the code
library_function(my_string);
The only difference between passing a string like that and passing a string literal is the section of the binary the data is stored in.
A string literal is typically stored in a read-only section (.rodata or .rdata on most toolchains, occasionally .text).
The non-const string will be stored in .data.
If you really, really care whether you're passing a function a pointer to read-only data or a pointer to .data, and you really, really trust the library to not modify the parameter now and forever, then you can certainly cast away the const-ness of your string literals.
Ignoring the fact that documentation lags behind implementation, even if we could believe the documentation's promise to not modify its inputs, if the library doesn't enforce that through the interface, then at any time, on purpose or by accident, that input could be modified.
The following user-defined literal creates a std::string for each literal and implicitly converts to char*.
#include <cstddef>
#include <string>
struct stringlitwrapper
{
    // Copies the literal into an owned, writable std::string.
    constexpr stringlitwrapper(const char* c, std::size_t n) : s(c, n) {}
    // Non-const data() returns char* from C++17 on.
    operator char*() { return s.data(); }
    std::string s;
};
constexpr stringlitwrapper operator""_w(const char* c, std::size_t n)
{
    return stringlitwrapper(c, n);
}
void libfunction(char* param) {
    // uses non-const char* as parameter
}
int main() {
    libfunction("string literal"_w);
    return 0;
}
For compilers that do not support constexpr here (MSVC does, Clang does not), drop both constexpr specifiers.
By internally storing the literal as non-const string, there is no undefined behaviour involved.
(The library function of course should not overwrite at all or at least not write over the end of the string.)
To prevent heap allocations, the std::string could be replaced by a char array with fixed (maximal) size.
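A minimal sketch of that last suggestion, assuming a hypothetical FixedLiteralBuffer with a compile-time capacity. Input longer than the capacity is silently truncated here, which may or may not be acceptable for a given library.

```cpp
#include <cstddef>

// Hypothetical fixed-capacity replacement for the std::string member.
// Capacity counts characters; one extra byte holds the terminator.
template <std::size_t Capacity>
struct FixedLiteralBuffer {
    char data[Capacity + 1] = {};

    // Copies at most Capacity characters and always null-terminates.
    constexpr FixedLiteralBuffer(const char* src, std::size_t n) {
        const std::size_t len = n < Capacity ? n : Capacity;
        for (std::size_t i = 0; i < len; ++i) {
            data[i] = src[i];
        }
        data[len] = '\0';
    }

    // Hands the writable copy to functions that expect char*.
    constexpr operator char*() { return data; }
};
```

A call such as libfunction(FixedLiteralBuffer<32>("string literal", 14)) would then pass a writable stack buffer instead of the literal itself.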

C++ const cast, unsure if this is secure

It may seem like a silly question, but I really need to clarify this:
Will this bring any danger to my program?
Is the const_cast even needed?
If I change the values behind the input pointer in place, will it work safely with std::string or will it cause undefined behaviour?
So far my only concern is that modifying through the input pointer could affect the string some_text and make it unusable.
std::string some_text = "Text with some input";
char * input = const_cast<char*>(some_text.c_str());
Thanks for giving me some hints; I would like to avoid shooting myself in the foot.
As an example of evil behavior: the interaction with gcc's Copy On Write implementation.
#include <string>
#include <iostream>
int main() {
std::string const original = "Hello, World!";
std::string copy = original;
char* c = const_cast<char*>(copy.c_str());
c[0] = 'J';
std::cout << original << "\n";
}
In action at ideone.
Jello, World!
The issue? As the name implies, gcc's implementation of std::string (libstdc++ before the C++11-conforming ABI introduced in GCC 5) uses a ref-counted shared buffer under the covers. When a string is modified, the implementation will neatly check if the buffer is shared at the moment, and if it is, copy it before modifying it, ensuring that other strings sharing this buffer are not affected by the new write (thus the name, copy on write).
Now, with your evil program, you access the shared buffer via a const-method (promising not to modify anything), but you do modify it!
Note that with MSVC's implementation, which does not use Copy On Write, the behavior would be different ("Hello, World!" would be correctly printed).
This is exactly the essence of Undefined Behavior.
Modifying an inherently const object by casting away its constness with const_cast is Undefined Behavior.
string::c_str() returns a const char *, i.e: a pointer to a constant c-style string. Technically, modifying this will result in Undefined Behavior.
Note that the legitimate use of const_cast is when you have a pointer-to-const that actually refers to non-const data and you wish to modify that non-const data.
Simply casting will not bring forth undefined behavior. Modifying the data pointed at, however, will. (Also see ISO 14882:98 5.2.11-7).
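To make the distinction concrete, here is a small sketch with a hypothetical helper, overwriteFirst. The cast itself is harmless; whether the write is defined depends entirely on how the pointed-to object was originally declared.

```cpp
// Hypothetical helper: writes through a pointer-to-const.
// Well defined only when the object p points at was not declared const.
inline void overwriteFirst(const char* p, char replacement) {
    char* writable = const_cast<char*>(p);  // the cast alone is harmless
    writable[0] = replacement;              // the write is what can be undefined
}
```

With char buf[] = "hello"; overwriteFirst(buf, 'J'); the write is fine because buf is a non-const array. Passing a string literal, or the result of c_str(), to the same helper would be the undefined case discussed above.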
If you want a pointer to modifiable data, you can have a
std::vector<char> wtf(str.begin(), str.end());
char* lol= &wtf[0];
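A slightly fuller sketch of the same idea, assuming a hypothetical legacy function (legacyUppercaseFirst) that writes into its char* argument. Unlike the two-line version above, it appends a terminating null so the buffer can be used as a C string.

```cpp
#include <string>
#include <vector>

// Hypothetical legacy function that modifies its argument in place.
inline void legacyUppercaseFirst(char* s) {
    if (s[0] >= 'a' && s[0] <= 'z') {
        s[0] = static_cast<char>(s[0] - 'a' + 'A');
    }
}

// Copies the string into a writable, null-terminated buffer, lets the
// legacy function work on the copy, and returns the result.
inline std::string callWithWritableCopy(const std::string& text) {
    std::vector<char> buffer(text.begin(), text.end());
    buffer.push_back('\0');
    legacyUppercaseFirst(buffer.data());
    return std::string(buffer.data());
}
```

The original string is never touched, so there is no const_cast and no undefined behavior, at the cost of one extra copy.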
The std::string manages its own memory internally, which is why c_str() can return a pointer directly into that memory. The pointer is to const so that your compiler will complain if you try to modify it.
Using const_cast in that way literally casts away such safety and is only an arguably acceptable practice if you are absolutely sure that memory will not be modified.
If you can't guarantee this then you must copy the string and use the copy; it's certainly a lot safer to do this in any event (you can use strcpy).
See the C++ reference website:
const char* c_str ( ) const;
"Generates a null-terminated sequence of characters (c-string) with the same content as the string object and returns it as a pointer to an array of characters.
A terminating null character is automatically appended.
The returned array points to an internal location with the required storage space for this sequence of characters plus its terminating null-character, but the values in this array should not be modified in the program and are only guaranteed to remain unchanged until the next call to a non-constant member function of the string object."
Yes, it will bring danger, because
input points to whatever c_str happens to be right now, but if some_text ever changes or goes away, you'll be left with a pointer that points to garbage. The value of c_str is guaranteed to be valid only as long as the string doesn't change. And even, formally, only if you don't call c_str() on other strings too.
Why do you need to cast away the const? You're not planning on writing to *input, are you? That is a no-no!
This is a very bad thing to do. Check out what std::string::c_str() does and agree with me.
Second, consider why you want a non-const access to the internals of the std::string. Apparently you want to modify the contents, because otherwise you would use a const char pointer. Also you are concerned that you don't want to change the original string. Why not write
std::string input( some_text );
Then you have a std::string that you can mess with without affecting the original, and you have std::string functionality instead of having to work with a raw C++ pointer...
Another spin on this is that it makes code extremely difficult to maintain. Case in point: a few years ago I had to refactor some code containing long functions. The author had written the function signatures to accept const parameters but then was const_casting them within the function to remove the constness. This broke the implied guarantee given by the function and made it very difficult to know whether the parameter has changed or not within the rest of the body of the code.
In short, if you have control over the string and you think you'll need to change it, make it non-const in the first place. If you don't then you'll have to take a copy and work with that.
it is UB.
For example, you can do something like this:
size_t const size = (sizeof(int) == 4 ? 1024 : 2048);
int arr[size];
without any cast and the compiler will not report an error. But this code is illegal.
The moral is that you need to think about each case individually.

Function that takes a char array as a parameter

There is a function I want to use that takes char str[] as a parameter. I want to call the function giving a string input.
void someFunction (char str[]) {
/* ... */
}
// Works.
someFunction("1010101");
// Does not work.
string someString;
someFunction(someString);
How can I get the second call to work?
EDIT: I cannot change the function's input parameters.
Depends on the nature of the string manipulations. If you read but don't write the string, change the prototype to const char str[] and use someString.c_str(), like others are suggesting.
If you change the characters but not the length of the string, use &*someString.begin().
If you extend/truncate the string, it's easier to pass a string& and work in terms of the string object. Less trouble, honestly.
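For the middle case (changing characters but not the length), a minimal sketch using the equivalent &s[0] form. It assumes C++11 or later, where std::string storage is contiguous and null-terminated; the body of someFunction is invented here purely so the example has something to do.

```cpp
#include <string>

// Stand-in for the question's someFunction; the bit-flipping body is invented.
inline void someFunction(char str[]) {
    for (char* p = str; *p != '\0'; ++p) {
        if (*p == '0') {
            *p = '1';
        } else if (*p == '1') {
            *p = '0';
        }
    }
}

// Passes the string's own buffer; safe only while the callee keeps the length.
inline void callInPlace(std::string& s) {
    someFunction(&s[0]);
}
```

If the callee might write past the current length or overwrite the terminator, copying into a separate buffer is the safer route.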
You should be able to do:
someFunction(const_cast<char*>(someString.c_str()));
Although I'm not sure what will happen if str gets modified.
It's probably best if you just modify the original function to take a different parameter type.
What you want for std::string is void someFunction(std::string& str);
There's a reason for the issue -- a std::string's data is not guaranteed to be contiguous memory (at least, before C++11). Therefore, manipulating its buffer as a contiguous allocation (char[]) is a very bad idea.
Casting away the const of std::string::c_str() is also a bad idea. One immediate problem you may face is that a std::string implementation may share backing string allocations with other std::string instances (copy-on-write), and you will end up modifying the values of other std::strings. Of course, there are many other bad things that could go wrong in their own implementation-defined ways -- the standard left this very flexible for the implementors of standard libraries.
EDIT: I cannot change the function's input parameters.
Use a std::vector instead.
You could have your function take a std::string instead:
void someFunction (std::string &str) { /* ... */ }

Deprecated conversion from string const. to wchar_t*

Hello I have a pump class that requires using a member variable that is a pointer to a wchar_t array containing the port address ie: "com9".
The problem is that when I initialise this variable in the constructor my compiler flags up a deprecated conversion warning.
pump::pump(){
this->portNumber = L"com9";}
This works fine but the warning every time I compile is annoying and makes me feel like I'm doing something wrong.
I tried creating an array and then setting the member variable like this:
pump::pump(){
wchar_t port[] = L"com9";
this->portNumber = port;}
But for some reason this makes my portNumber point at 'F'.
Clearly another conceptual problem on my part.
Thanks for help with my noobish questions.
EDIT:
As requested, the definition of portNumber was:
class pump
{
private:
    wchar_t* portNumber;
};
Thanks to answers it has now been changed to:
class pump
{
private:
    const wchar_t* portNumber;
};
If portNumber is a wchar_t*, it should be a const wchar_t*.
String literals are immutable, so the elements are const. There exists a deprecated conversion from string literal to non-const pointer, but that's dangerous. Make the change so you're keeping type safety and not using the unsafe conversion.
The second one fails because you point to the contents of a local variable. When the constructor finishes, the variable goes away and you're pointing at an invalid location. Using it results in undefined behavior.
Lastly, use an initialization list:
pump::pump() :
portNumber(L"com9")
{}
The initialization list is to initialize, the constructor is to finish construction. (Also, this-> is ugly to almost all C++ people; it's not nice and redundant.)
Use const wchar_t* to point at a literal.
The reason the conversion exists is because it has been valid from early versions of C to assign a string literal to a non-const pointer[*]. The reason it's deprecated is that it's invalid to modify a literal, and it's risky to use a non-const pointer to refer to something that must not be modified.
[*] C didn't originally have const. When const was added, clearly it should apply to string literals, but there was already code out there, written before const existed, that would break if suddenly you had to sprinkle const everywhere. We're still paying today for avoiding that breaking change to the language. Since it's C++ you're using, it wouldn't even have been a breaking change to this language.
Apparently, portNumber is a wchar_t * (non-const), correct? If so:
The first one is wrong, because string literals are read-only (they are arrays of const char, or const wchar_t for wide literals, usually stored in the string table of the executable, which is mapped in memory somewhere, often in a read-only page).
The ugly, implicit conversion to non-const chars/wchar_ts was approved, IIRC, to achieve compatibility with old code written when const didn't even exist; sadly, it let a lot of morons who do not know what const correctness means get away with writing code that asks for non-const pointers even when const pointers would be the right choice.
The second one is wrong because you're making portNumber point to a variable allocated on the stack, which is deleted when the constructor returns. After the constructor returns, the pointer stored in portNumber points to random garbage.
The correct approach is to declare portNumber as const wchar_t * if it doesn't need to be modified. If, instead, it does need to be modified during the lifetime of the class, usually the best approach is to avoid C-style strings at all and just throw in a std::wstring, that will take care of all the bookkeeping associated with the string.
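A minimal sketch of the std::wstring variant, assuming a renamed class (Pump) and a hypothetical accessor so that code expecting a C-style wide string can still get one. The member owns its characters, so neither the deprecated conversion nor the dangling pointer can occur.

```cpp
#include <string>
#include <utility>

class Pump {
public:
    // Default port taken from the question; initialised in the init list.
    Pump() : portNumber_(L"com9") {}

    // Hypothetical extra constructor for a caller-supplied port.
    explicit Pump(std::wstring port) : portNumber_(std::move(port)) {}

    // Read-only C-style view, valid as long as this object is alive and unmodified.
    const wchar_t* portNumber() const { return portNumber_.c_str(); }

private:
    std::wstring portNumber_;  // owns the characters
};
```

If the port never changes and always comes from a literal, a plain const wchar_t* member initialised the same way is enough.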

when to use const char *

If I have a function API that expects a 14-digit input and returns a 6-digit output, I basically define the input as a const char *. Would that be the correct and safe thing to do?
Also, why would I not want to just use char *? I could, but it seems more prudent to use const char * in that case, especially since it's an API that I am providing. So for different input values I generate 6-digit codes.
I am not sure why you are using char pointers where you could use std::string:
std::string code(const std::string& input)
{ ... }
If you don't have the choice, using const char* gives a guarantee to the user that you won't change his data especially if it was a string literal where modifying one is undefined behavior.
By using const you're promising your user that you won't change the string being passed in. It becomes part of the API, helping define your function's behavior. It also lets users pass constant strings, including literal strings like "mystring".
When you say const char *c you are telling the compiler that you will not be making any changes to the data that c points to. So this is a good practice if you will not be directly modifying your input data.
String literals have static storage duration (they exist for the duration of the program) and may or may not be shared if the same string literal is referenced from multiple locations in a program. The effect of modifying a string literal is undefined; thus, you should always declare a pointer to a string literal as const char *.
You get several benefits for using const:
It documents your code, the user knows no harm will be done to this string.
You allow the user to send a const char* which he might have. Converting from non-const to const is automatic. The other way around is something that should be avoided (And done explicitly, and might lead to undefined behavior at times)
You let the compiler check you. The compiler can now verify that you don't accidentally change the user's string.
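The benefits listed above can be seen in a small sketch. The function isFourteenDigits is hypothetical and only validates the 14-digit input described in the question; how the 6-digit code is derived is not stated anywhere, so that part is left out.

```cpp
#include <cstddef>

// Hypothetical validator: true only for exactly 14 decimal digits.
// Taking const char* lets callers pass literals, const data, or writable
// buffers, and the compiler rejects any accidental write through input.
constexpr bool isFourteenDigits(const char* input) {
    std::size_t count = 0;
    for (; input[count] != '\0'; ++count) {
        if (input[count] < '0' || input[count] > '9') {
            return false;  // non-digit character
        }
        if (count >= 14) {
            return false;  // more than 14 characters
        }
    }
    return count == 14;
}
```

Both isFourteenDigits("12345678901234") and a call with a char array filled at run time compile without casts, whereas a char* parameter would reject the literal in modern C++.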
You need to use a const char * anywhere that you're passing a string literal, or the compiler will balk (assuming you don't want to convert it to a std::string).
const char* is usually used in parameters, stating that your function won't modify that string.
void function(char* modified_str, const char* not_modified_str) { ... }
If you're returning the const char* what you want to say is not obvious. You try to tell that nobody should modify the returned string, but you still (I think it would be that way) transfer the ownership to the calling routine, so that it would have to invoke delete[] on the char that your function returned.
Generally speaking, use std::string, then your function will look the following way:
std::string function(std::string& modified_str, const std::string& not_modified_str) { ... }