I have some C++ code written in C-style.
For some reasons I can't use C++ string and IO libraries, so for strings handling I should use only functions like sprintf, itoa, etc.
I want to replace C-style which requires temporary buffers
char buf[12];
itoa(x, buf, 16);
set_some_text(buf);
by the following code
class i2a
{
public:
explicit i2a(int value) { ::sprintf(buf, "%d", value); }
operator const char* () const { return buf; }
private:
char buf[12];
};
// usage:
set_some_text(i2a(x));
(Such classes can be written for char<->wchar_t convertions, etc.)
I see some cases when such classes will be dangerous:
For example, one can write
const char* someMeaningfulName = i2a(x);
// the right code should be i2a someMeaningfulName(x); or i2a someMeaningfulName = i2a(x);
set_some_text(someMeaningfulName);
In more complex case, a function which accepts text will not copy it, but will save pointer to it somewhere. For example it may be
class Foo { .... const char* p; };
Foo f(const char* text) { ... foo.p = text; return foo; }
it can be really unobvious, unlike const char* variable.
Is there a way to make such classes more secure?
Upd: why not std::string, boost::lexical_cast, boost::format, etc :
The code should work when compiled with -fno-except (C++ exceptions disabled - no throw, no stack unwinding). Also it should keep working on low memory conditions.
std::string, streams uses heap-allocated memory and at least throws bad_alloc.
When we have no free heap memory, usually we still have some kilobytes of stack (for example to write to user that we are out of memory and then make proper cleanup).
ATL and MFC String Conversion Macros are also written in this way. Calling the constructor directly like i2a(x) will create a temporary object that will live until the function to which it is passed is complete. So here: do_some(i2a(x)), the temporary object will be there until do_some() is complete.
Refer the Example section (example 2) of this msdn document
Here,
const char* someMeaningfulName = i2a(x);
set_some_text(someMeaningfulName);
This will not work as, the temporary object will be freed on the first statement itself. someMeaningfulName will be garbage. If you feel that as lack of security, all I can say is:
That's how Microsoft does it!
You will always get somewhere into troubles if you use const char* for your local variables.
You will have the same with const char* someMeaningfulName = std::string("foo").c_str();
If you can you should declare your local variable like this :
i2a someMeaningfulName(x);
set_some_text(someMeaningfulName);
You can also consider adding a copy constructor to i2a to avoid sharing the buffer between two instances.
Related
I am developing cross-platform C++ library. I wanted a better way of defining my string constants which will be used throughout the program. Which one is better static const char * or static const std::string?
Does constexpr make any of them better over the old declaration?
static const char *str = "hello world";
vs
static const string str = "hello world";
Why I need this:
My program might have to initialise a lot of static variable of string literals type before starting app. And i can't get away with static variables because they are used in the entire file. If i have to increase the performance, but reducing time of static variable initialisation. Ideally i would want most of things to happen at compile time rather than runtime. which one would be better choice?
Usually, std::string is good but in any case it depends on what you want to do. There is no absolute silver bullet here. std::string has a constructor and will call new/malloc for example (when SSO small string optimization can't be applied). If this is undesirable for any reason, go with const char*.
It depends also on where it 's to be used. If this is a HWND window title for example and the main use is with Windows API functions, it won't differ if you use std::string's c_str() method all the time.
All this is to advise that there is no 100% preference for one or another.
The usage or not of static is a different discussion.
It will depend on what you do with these variables. For example, take a look at this:
void do_stuff(std::string const&);
void stuff() {
while(/* some condition */) {
do_stuff(constant); // very hot loop
}
}
Let's take the two ways of defining the constant variable:
constexpr auto constant = "value"; // (1)
auto const constant = std::string{"value"}; // (2)
In the case of the hot loop, if you take (1), it will create a string each time. If the string value it long enough, it can cause allocation. Then (2) is better for that case.
On the other hand, if you refactor your code to something like this:
void do_stuff(std::string_view);
void stuff() {
while(/* some condition */) {
do_stuff(constant); // very hot loop
}
}
Then your program will be much faster with (1), since no dynamic allocation will occur. Your program will refer to the original read only data of your program.
With modern C++ I would use std::string_view.
If you do need it to be std::string then the best option is to define it as a static local variable in a function, so it is lazily initialized upon the first request.
/* inline or not */
std::string const& get_that_string() {
static std::string const str { "value" };
return value;
}
Unless there are known problems in your specific code base with using std::string, I strongly suggest using std::string.
static std::string const str = "hello world";
In general you should prefer std::string over char*. static has nothing to do with it.
In C++ is perfectly legitimate to do:
bool x = "hi";
Because "hi" is translated by compiler to a char array and returns a pointer to that array, which is a number and number can be implicitly converted to bool (0 is false, anything else is true).
So I have these ctor:
Exception(QString Text, bool __IsRecoverable = true);
Exception(QString Text, QString _Source, bool __IsRecoverable = true);
Sadly I figured out that calling
Exception *e = new Exception("error happened", "main.cpp #test");
It creates a new instance of "Exception" class which is created using Exception(QString Text, bool __IsRecoverable = true); constructor, which is wrong to a point.
Is there a simple way to ensure that correct function is called, other than restructuring the constructors entirely, changing position of arguments, etc?
Firstly, I'm not sure why you're dynamically allocating an exception class. I'm not sure that's ever a good idea.
You can explicitly construct a QString:
Exception e("error happened", QString("main.cpp #test"));
Or you can pass the third argument:
Exception e("error happened", "main.cpp #test", true);
Or you can add an additional constructor that takes const char* and will be preferred over the conversion to bool:
Exception(QString Text, const char* Source, bool IsRecoverable = true);
You can easily make this forward to the QString version. Also note that names beginning with an underscore and a capital letter or with two underscores are reserved.
My suggestion would be to not use default arguments. They contribute to overload resolution problems like this, and anyway it is not very readable to just see true as an argument. Whoever's reading the code then has to stop and go look up what the true means. Even if it's yourself you may forget it in a few months time when you come back to the code, especially if you do this sort of thing a lot.
For example:
struct Exception: public whatever
{
Exception(char const *text);
Exception(char const *text, char const *source);
};
struct RecoverableException: public Exception
{
RecoverableException(char const *text);
RecoverableException(char const *text, char const *source);
};
It's a little bit more typing in this source file but the payoff is that your code which actually uses the exceptions is simpler and clearer.
To implement these constructors you could have them all call a particular function in the .cpp file with relevant arguments selecting which behaviour you want.
I have a preference for using char const * rather than QString as I am paranoid about two things:
unwanted conversions
memory allocation failure
If constructing a QString throws then things go downhill fast. But you may choose to not worry about this possibility because if the system ran out of memory and your exception handling doesn't prepare for that possibility then it's going to terminate either way.
In paragraph "How should I design my exception classes?" in this "Error and Exception Handling" Boost web page, it reads:
[...] 3. Don't embed a std::string object or any other data member or base class whose copy constructor could throw an exception.
I have to define an exception class to represent some form of run-time error on file access, so I was thinking to derive it from std::runtime_error, and add a FileName() attribute to get access to the file name for which the error occurred.
For simplicity sake, my intention was to add a std::wstring data member to store the file name (in Unicode), but the aforementioned suggestion kind of stopped me.
So, should I use a simple wchar_t buffer as a data member?
On modern desktop systems (which are my target platforms for this project), is it really important to pay attention to a dynamic string allocation for a file name? What is the likelihood of such allocation to fail? I can understand Boost's suggestion for limited-resource systems like embedded systems, but is it valid also for modern desktop PC's?
//
// Original design, using std::wstring.
//
class FileIOError : public std::runtime_error
{
public:
FileIOError(HRESULT errorCode, const std::wstring& filename, const char* message)
: std::runtime_error(message),
m_errorCode(errorCode),
m_filename(filename)
{
}
HRESULT ErrorCode() const
{
return m_errorCode;
}
const std::wstring& FileName() const
{
return m_filename;
}
private:
HRESULT m_errorCode;
std::wstring m_filename;
};
//
// Using raw wchar_t buffer, following Boost's guidelines.
//
class FileIOError : public std::runtime_error
{
public:
FileIOError(HRESULT errorCode, const wchar_t* filename, const char* message)
: std::runtime_error(message),
m_errorCode(errorCode)
{
// Safe string copy
// EDIT: use wcsncpy_s() with _TRUNCATE, as per Hans Passant's suggestion.
wcsncpy_s(m_filename, filename, _TRUNCATE);
}
HRESULT ErrorCode() const
{
return m_errorCode;
}
const wchar_t* FileName() const
{
return m_filename;
}
private:
HRESULT m_errorCode;
wchar_t m_filename[MAX_PATH]; // circa 260 wchar_t's
};
What is the likelihood of such allocation to fail?
Pretty low.
Usually for the purpose of code correctness you only really care whether that likelihood is zero or non-zero. Even on a system where the built in ::operator new never fails due to rampant over-commit, consider the likelihood that your code will be used in a program that replaces ::operator new. Or maybe an OS will externally limit the amount of memory a process is permitted to allocate, by ulimit -v or otherwise.
Next consider the consequences of it failing. terminate() is called. Maybe you can live with that, especially since it's unlikely to actually happen.
Basically, do you want your program to even try to exit cleanly with a reasonable error message in the case where you can't allocate memory? If so, write the extra code and accept the limit on the length of the error message. Because Boost is general-purpose library code, it doesn't assume on behalf of its users that they don't want to try.
If you are using C++11, you can use move semantics and move constructor of std::wstring, which is noexcept.
I've created a function that will convert all the event notification codes to strings. Pretty simple stuff really.
I've got a bunch of consts like
const _bstr_t DIRECTSHOW_MSG_EC_ACTIVATE("A video window is being activated or deactivated.");
const _bstr_t DIRECTSHOW_MSG_EC_BUFFERING_DATA("The graph is buffering data, or has stopped buffering data.");
const _bstr_t DIRECTSHOW_MSG_EC_BUILT("Send by the Video Control when a graph has been built. Not forwarded to applications.");
.... etc....
and my function
TCHAR* GetDirectShowMessageDisplayText( int messageNumber )
{
switch( messageNumber )
{
case EC_ACTIVATE: return DIRECTSHOW_MSG_EC_ACTIVATE;
case EC_BUFFERING_DATA: return DIRECTSHOW_MSG_EC_BUFFERING_DATA;
case EC_BUILT: return DIRECTSHOW_MSG_EC_BUILT;
... etc ...
No big deal. Took me 5 minutes to throw together.
... but I simply don't trust that I've got all the possible values, so I want to have a default to return something like "Unexpected notification code (7410)" if no matches are found.
Unfortunately, I can't think of anyway to return a valid pointer, without forcing the caller to delete the string's memory ... which is not only nasty, but also conflicts with the simplicity of the other return values.
So I can't think of any way to do this without changing the return value to a parameter where the user passes in a buffer and a string length. Which would make my function look like
BOOL GetDirectShowMessageDisplayText( int messageNumber, TCHAR* outBuffer, int bufferLength )
{
... etc ...
I really don't want to do that. There must be a better way.
Is there?
I'm coming back to C++ after a 10 year hiatus, so if it's something obvious, don't discount that I've overlooked it for a reason.
C++? std::string. It's not going to destroy the performance on any modern computer.
However if you have some need to over-optimize this, you have three options:
Go with the buffer your example has.
Have the users delete the string afterwards. Many APIs like this provide their own delete function for deleting each kind of dynamically allocated return data.
Return a pointer to a static buffer which you fill in with the return string on each call. This does have some drawbacks, though, in that it's not thread safe, and it can be confusing because the returned pointer's value will change the next time someone calls the function. If non-thread-safety is acceptable and you document the limitations, it should be all right though.
If you are returning a point to a string constant, the caller will not have to delete the string - they'll only have to if you are new-ing the memory used by the string every time. If you're just returning a pointer to a string entry in a table of error messages, I would change the return type to TCHAR const * const and you should be OK.
Of course this will not prevent users of your code to attempt to delete the memory referenced by the pointer but there is only so much you can do to prevent abuse.
Just declare use a static string as a default result:
TCHAR* GetDirectShowMessageDisplayText( int messageNumber )
{
switch( messageNumber )
{
// ...
default:
static TCHAR[] default_value = "This is a default result...";
return default_value;
}
}
You may also declare "default_value" outside of the function.
UPDATE:
If you want to insert a message number in that string then it won't be thread-safe (if you are using multiple threads). However, the solution for that problem is to use thread-specific string. Here is an example using Boost.Thread:
#include <cstdio>
#include <boost/thread/tss.hpp>
#define TCHAR char // This is just because I don't have TCHAR...
static void errorMessageCleanup (TCHAR *msg)
{
delete []msg;
}
static boost::thread_specific_ptr<TCHAR> errorMsg (errorMessageCleanup);
static TCHAR *
formatErrorMessage (int number)
{
static const size_t MSG_MAX_SIZE = 256;
if (errorMsg.get () == NULL)
errorMsg.reset (new TCHAR [MSG_MAX_SIZE]);
snprintf (errorMsg.get (), MSG_MAX_SIZE, "Unexpected notification code (%d)", number);
return errorMsg.get ();
}
int
main ()
{
printf ("Message: %s\n", formatErrorMessage (1));
}
The only limitation of this solution is that returned string cannot be passed by the client to the other thread.
Perhaps have a static string buffer you return a pointer to:
std::ostringstream ss;
ss << "Unexpected notification code (" << messageNumber << ")";
static string temp = ss.str(); // static string always has a buffer
return temp.c_str(); // return pointer to buffer
This is not thread safe, and if you persistently hold the returned pointer and call it twice with different messageNumbers, they all point to the same buffer in temp - so both pointers now point to the same message. The solution? Return a std::string from the function - that's modern C++ style, try to avoid C style pointers and buffers. (It looks like you might want to invent a tstring which would be std::string in ANSI and std::wstring in unicode, although I'd recommend just going unicode-only... do you really have any reason to support non-unicode builds?)
You return some sort of self-releasing smart pointer or your own custom string class. You should follow the interface as it's defined in std::string for easiest use.
class bstr_string {
_bstr_t contents;
public:
bool operator==(const bstr_string& eq);
...
~bstr_string() {
// free _bstr_t
}
};
In C++, you never deal with raw pointers unless you have an important reason, you always use self-managing classes. Usually, Microsoft use raw pointers because they want their interfaces to be C-compatible, but if you don't care, then don't use raw pointers.
The simple solution does seem to be to just return a std::string. It does imply one dynamic memory allocation, but you'd probably get that in any case (as either the user or your function would have to make the allocation explicitly)
An alternative might be to allow the user to pass in an output iterator which you write the string into. Then the user is given complete control over how and when to allocate and store the string.
On the first go-round I missed that this was a C++ question rather than a plain C question. Having C++ to hand opens up another possibility: a self-managing pointer class that can be told whether or not to delete.
class MsgText : public boost::noncopyable
{
const char* msg;
bool shouldDelete;
public:
MsgText(const char *msg, bool shouldDelete = false)
: msg(msg), shouldDelete(shouldDelete)
{}
~MsgText()
{
if (shouldDelete)
free(msg);
}
operator const char*() const
{
return msg;
}
};
const MsgText GetDirectShowMessageDisplayText(int messageNumber)
{
switch(messageNumber)
{
case EC_ACTIVATE:
return MsgText("A video window is being activated or deactivated.");
// etc
default: {
char *msg = asprintf("Undocumented message (%u)", messageNumber);
return MsgText(msg, true);
}
}
}
(I don't remember if Windows CRT has asprintf, but it's easy enough to rewrite the above on top of std::string if it doesn't.)
Note the use of boost::noncopyable, though - if you copy this kind of object you risk double frees. Unfortunately, that may cause problems with returning it from your message-pretty-printer function. I'm not sure what the right way to deal with that is, I'm not actually much of a C++ guru.
You already use _bstr_t, so if you can just return those directly:
_bstr_t GetDirectShowMessageDisplayText(int messageNumber);
If you need to build a different message at runtime you can pack it into a _bstr_t too. Now the ownership is clear and the use is still simple thanks to RAII.
The overhead is negligible (_bstr_t uses ref-counting) and the calling code can still use _bstr_ts conversion to wchar_t* and char* if needed.
There's no good answer here, but this kludge might suffice.
const char *GetDirectShowMessageDisplayText(int messageNumber)
{
switch(messageNumber)
{
// ...
default: {
static char defaultMessage[] = "Unexpected notification code #4294967296";
char *pos = defaultMessage + sizeof "Unexpected notification code #" - 1;
snprintf(pos, sizeof "4294967296" - 1, "%u", messageNumber);
return defaultMessage;
}
}
}
If you do this, callers must be aware that the string they get back from GetDirectShowMessageText might be clobbered by a subsequent call to the function. And it's not thread safe, obviously. But those might be acceptable limitations for your application.
I am working in C++ with two large pieces of code, one done in "C style" and one in "C++ style".
The C-type code has functions that return const char* and the C++ code has in numerous places things like
const char* somecstylefunction();
...
std::string imacppstring = somecstylefunction();
where it is constructing the string from a const char* returned by the C style code.
This worked until the C style code changed and started returning NULL pointers sometimes. This of course causes seg faults.
There is a lot of code around and so I would like to most parsimonious way fix to this problem. The expected behavior is that imacppstring would be the empty string in this case. Is there a nice, slick solution to this?
Update
The const char* returned by these functions are always pointers to static strings. They were used mostly to pass informative messages (destined for logging most likely) about any unexpected behavior in the function. It was decided that having these return NULL on "nothing to report" was nice, because then you could use the return value as a conditional, i.e.
if (somecstylefunction()) do_something;
whereas before the functions returned the static string "";
Whether this was a good idea, I'm not going to touch this code and it's not up to me anyway.
What I wanted to avoid was tracking down every string initialization to add a wrapper function.
Probably the best thing to do is to fix the C library functions to their pre-breaking change behavior. but maybe you don't have control over that library.
The second thing to consider is to change all the instances where you're depending on the C lib functions returning an empty string to use a wrapper function that'll 'fix up' the NULL pointers:
const char* nullToEmpty( char const* s)
{
return (s ? s : "");
}
So now
std::string imacppstring = somecstylefunction();
might look like:
std::string imacppstring( nullToEmpty( somecstylefunction());
If that's unacceptable (it might be a lot of busy work, but it should be a one-time mechanical change), you could implement a 'parallel' library that has the same names as the C lib you're currently using, with those functions simply calling the original C lib functions and fixing the NULL pointers as appropriate. You'd need to play some tricky games with headers, the linker, and/or C++ namespaces to get this to work, and this has a huge potential for causing confusion down the road, so I'd think hard before going down that road.
But something like the following might get you started:
// .h file for a C++ wrapper for the C Lib
namespace clib_fixer {
const char* somecstylefunction();
}
// .cpp file for a C++ wrapper for the C Lib
namespace clib_fixer {
const char* somecstylefunction() {
const char* p = ::somecstylefunction();
return (p ? p : "");
}
}
Now you just have to add that header to the .cpp files that are currently calling calling the C lib functions (and probably remove the header for the C lib) and add a
using namespace clib_fixer;
to the .cpp file using those functions.
That might not be too bad. Maybe.
Well, without changing every place where a C++ std::string is initialized directly from a C function call (to add the null-pointer check), the only solution would be to prohibit your C functions from returning null pointers.
In GCC compiler, you can use a compiler extension "Conditionals with Omitted Operands" to create a wrapper macro for your C function
#define somecstylefunction() (somecstylefunction() ? : "")
but in general case I would advise against that.
I suppose you could just add a wrapper function which tests for NULL, and returns an empty std::string. But more importantly, why are your C functions now returning NULL? What does a NULL pointer indicate? If it indicates a serious error, you might want your wrapper function to throw an exception.
Or to be safe, you could just check for NULL, handle the NULL case, and only then construct an std::string.
const char* s = somecstylefunction();
if (!s) explode();
std::string str(s);
For a portable solution:
(a) define your own string type. The biggest part is a search and replace over the entire project - that can be simple if it's always std::string, or big one-time pain. (I'd make the sole requriement that it's Liskov-substitutable for a std::string, but also constructs an empty string from an null char *.
The easiest implementation is inheriting publicly from std::string. Even though that's frowned upon (for understandable reasons), it would be ok in this case, and also help with 3rd party libraries expecting a std::string, as well as debug tools. Alternatively, aggegate and forward - yuck.
(b) #define std::string to be your own string type. Risky, not recommended. I wouldn't do it unless I knew the codebases involved very well and saves you tons of work (and I'd add some disclaimers to protect the remains of my reputation ;))
(c) I've worked around a few such cases by re-#define'ing the offensive type to some utility class only for the purpose of the include (so the #define is much more limited in scope). However, I have no idea how to do that for a char *.
(d) Write an import wrapper. If the C library headers have a rather regular layout, and/or you know someone who has some experience parsing C++ code, you might be able to generate a "wrapper header".
(e) ask the library owner to make the "Null string" value configurable at least at compile time. (An acceptable request since switching to 0 can break compatibility as well in other scenarios) You might even offer to submit the change yourself if that's less work for you!
You could wrap all your calls to C-stlye functions in something like this...
std::string makeCppString(const char* cStr)
{
return cStr ? std::string(cStr) : std::string("");
}
Then wherever you have:
std::string imacppstring = somecstylefunction();
replace it with:
std::string imacppstring = makeCppString( somecystylefunction() );
Of course, this assumes that constructing an empty string is acceptable behavior when your function returns NULL.
I don't generally advocate subclassing standard containers, but in this case it might work.
class mystring : public std::string
{
// ... appropriate constructors are an exercise left to the reader
mystring & operator=(const char * right)
{
if (right == NULL)
{
clear();
}
else
{
std::string::operator=(right); // I think this works, didn't check it...
}
return *this;
}
};
Something like this should fix your problem.
const char *cString;
std::string imacppstring;
cString = somecstylefunction();
if (cString == NULL) {
imacppstring = "";
} else {
imacppstring = cString;
}
If you want, you could stick the error checking logic in its own function. You'd have to put this code block in fewer places, then.