Many Windows APIs take a pointer to a buffer and a size, but the result needs to go into a C++ string. (I'm using Windows Unicode here, so they are wstrings.)
Here is an example:
#include <iostream>
#include <string>
#include <vector>
#include <windows.h>
using namespace std;
// This is the method I'm interested in improving ...
wstring getComputerName()
{
    vector<wchar_t> buffer;
    buffer.resize(MAX_COMPUTERNAME_LENGTH + 1);
    DWORD size = MAX_COMPUTERNAME_LENGTH;
    GetComputerNameW(&buffer[0], &size);
    return wstring(&buffer[0], size);
}

int main()
{
    wcout << getComputerName() << "\n";
}
My question really is: is this the best way to write the getComputerName function so that it fits into C++ better, or is there a better way? I don't see any way to use a string directly without going via a vector, unless I've missed something. It works fine, but somehow seems a little ugly. The question isn't about that particular API; it's just a convenient example.
In this case, I don't see what std::vector brings to the party. MAX_COMPUTERNAME_LENGTH is not likely to be very large, so I would simply use a C-style array as the temporary buffer.
See this answer to another question. It provides the source to a StringBuffer class which handles this situation very cleanly.
I would say, since you are already at the task of abstracting the Windows API behind a more generic C++ interface, do away with the vector altogether and don't bother with the wstring constructor:
wstring getComputerName()
{
    wchar_t name[MAX_COMPUTERNAME_LENGTH + 1] = {};
    DWORD size = MAX_COMPUTERNAME_LENGTH + 1; // buffer size, in characters
    GetComputerNameW(name, &size);
    return name;
}
This function will return a valid wstring object.
I'd use the vector. In response to you saying you picked a bad example, pretend for a moment that we don't have a reasonable constant upper bound on the string length. Then it's not quite as easy:
#include <string>
#include <vector>
#include <windows.h>
using std::wstring;
using std::vector;
wstring getComputerName()
{
    DWORD size = 1; // or a bigger number if you like
    vector<wchar_t> buffer(size);
    while (GetComputerNameW(&buffer[0], &size) == 0)
    {
        if (GetLastError() != ERROR_BUFFER_OVERFLOW) aargh(); // handle error
        buffer.resize(++size);
    }
    return wstring(&buffer[0], size);
}
In practice, you can probably get away with writing into a string, but I'm not entirely sure. You certainly need additional guarantees made by your implementation of std::wstring, beyond what's in the standard, but I expect MSVC's strings are probably OK.
I think that if wstring::reference is wchar_t& then you're sorted. 21.3.4 defines that non-const operator[] returns a reference, and that it returns data()[pos]. So if reference is just a plain wchar_t& then there's no scope for exciting copy-on-write behaviour through the reference, and the string must in fact be modifiable through the pointer &buffer[0]. I think. The basic problem here is that the standard allowed implementations more flexibility than turned out to be needed.
That's a lot of effort and commenting though, just to avoid copying a string, so I've never felt the need to avoid an intermediate array/vector.
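For completeness, here is a minimal sketch of writing into the wstring directly, assuming the contiguity that C++11 guarantees (and that MSVC's implementation provides in practice); the error handling is just illustrative:

wstring getComputerName()
{
    wstring name(MAX_COMPUTERNAME_LENGTH + 1, L'\0');
    DWORD size = static_cast<DWORD>(name.size());
    if (!GetComputerNameW(&name[0], &size))
        return wstring();  // or report GetLastError() however you prefer
    name.resize(size);     // on success, size excludes the terminating NUL
    return name;
}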
Related
Consider a scenario where std::string is used to store a secret. Once it is consumed and no longer needed, it would be good to cleanse it, i.e. overwrite the memory that contained it, thus hiding the secret.
std::string provides a function const char* data() returning a pointer to (since C++11) contiguous memory.
Now, since the memory is contiguous and the variable will be destroyed right after the cleanse due to scope end, would it be safe to:
char* modifiable = const_cast<char*>(secretString.data());
OPENSSL_cleanse(modifiable, secretString.size());
According to the standard, quoted here:
$5.2.11/7 - Note: Depending on the type of the object, a write operation through the pointer, lvalue or pointer to data member resulting from a const_cast that casts away a const-qualifier may produce undefined behavior (7.1.5.1).
That would advise otherwise, but do the conditions above (contiguous, to-be-just-removed) make it safe?
The standard explicitly says you must not write to the const char* returned by data(), so don't do that.
There are perfectly safe ways to get a modifiable pointer instead:
if (secretString.size())
    OPENSSL_cleanse(&secretString.front(), secretString.size());
Or if the string might have been shrunk already and you want to ensure its entire capacity is wiped:
if (secretString.capacity()) {
    secretString.resize(secretString.capacity());
    OPENSSL_cleanse(&secretString.front(), secretString.size());
}
It is probably safe. But not guaranteed.
However, since C++11, a std::string must be implemented as contiguous data so you can safely access its internal array using the address of its first element &secretString[0].
if (!secretString.empty()) // avoid UB on an empty string
{
    char* modifiable = &secretString[0];
    OPENSSL_cleanse(modifiable, secretString.size());
}
std::string is a poor choice to store secrets. Since strings are copyable and sometimes copies go unnoticed, your secret may "get legs". Furthermore, string expansion techniques may cause multiple copies of fragments (or all of) your secrets.
Experience dictates a movable, non-copyable, wiped clean on destroy, unintelligent (no tricky copies under-the-hood) class.
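As a rough illustration of that idea (the class name and details here are hypothetical, not a vetted library, and a real implementation would also have to stop the compiler optimising the wipe away):

#include <cstring>
#include <string>
#include <utility>

class SecretString {
public:
    explicit SecretString(std::string s) : data_(std::move(s)) {}
    SecretString(const SecretString&) = delete;            // non-copyable
    SecretString& operator=(const SecretString&) = delete;
    SecretString(SecretString&&) = default;                // movable
    SecretString& operator=(SecretString&&) = default;
    ~SecretString() {
        if (!data_.empty())
            std::memset(&data_[0], 0, data_.size());       // wiped on destroy
    }
    const std::string& str() const { return data_; }
private:
    std::string data_;
};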
You can use std::fill to fill the string with trash:
std::fill(str.begin(),str.end(), 0);
Do note that simply clearing or shrinking the string (with methods such as clear or shrink_to_fit) does not guarantee that the string data will be wiped from the process memory. Malicious processes may dump the process memory and extract the secret if the string is not overwritten correctly.
Bonus: Interestingly, the ability to trash the string data for security reasons forces some programming languages like Java to return passwords as char[] and not String. In Java, String is immutable, so "trashing" it will make a new copy of the string. Hence, you need a modifiable object like char[] which does not use copy-on-write.
Edit: if your compiler does optimize this call out, you can use specific compiler flags to make sure a trashing function will not be optimized out:
#ifdef _MSC_VER
#pragma optimize("", off)
void trashString(std::string& str) {
    std::fill(str.begin(), str.end(), 0);
}
#pragma optimize("", on)
#endif

#if defined(__GNUC__) && !defined(__clang__)
void __attribute__((optimize("O0"))) trashString(std::string& str) {
    std::fill(str.begin(), str.end(), 0);
}
#endif

#ifdef __clang__
void __attribute__((optnone)) trashString(std::string& str) {
    std::fill(str.begin(), str.end(), 0);
}
#endif
There's a better answer: don't!
std::string is a class which is designed to be user-friendly and efficient. It was not designed with cryptography in mind, so there are few guarantees written into it to help you out. For example, there is no guarantee that your data hasn't been copied elsewhere. At best, you could hope that a particular compiler's implementation offers you the behavior you want.
If you actually want to treat a secret as a secret, you should handle it using tools which are designed for handling secrets. In fact, you should develop a threat model for what capabilities your attacker has, and choose your tools accordingly.
Tested solution on CentOS 6, Debian 8 and Ubuntu 16.04 (g++/clang++, O0, O1, O2, O3):
secretString.resize(secretString.capacity(), '\0');
OPENSSL_cleanse(&secretString[0], secretString.size());
secretString.clear();
If you were really paranoid you could randomise the data in the cleansed string, so as not to give away the length of the string or a location that contained sensitive data:
#include <string>
#include <stdlib.h>
#include <string.h>

typedef void* (*memset_t)(void*, int, size_t);
static volatile memset_t memset_func = memset;

void cleanse(std::string& to_cleanse) {
    to_cleanse.resize(to_cleanse.capacity(), '\0');
    for (size_t i = 0; i < to_cleanse.size(); ++i) {
        memset_func(&to_cleanse[i], rand(), 1);
    }
    to_cleanse.clear();
}
You could seed the rand() if you wanted also.
You could also do similar string cleansing without openssl dependency, by using explicit_bzero to null the contents:
#include <string>
#include <string.h>

int main() {
    std::string secretString = "ajaja";
    secretString.resize(secretString.capacity(), '\0');
    explicit_bzero(&secretString[0], secretString.size());
    secretString.clear();
    return 0;
}
Is there a way to get the "raw" buffer of a std::string?
I'm thinking of something similar to CString::GetBuffer(). For example, with CString I would do:
CString myPath;
::GetCurrentDirectory(MAX_PATH+1, myPath.GetBuffer(MAX_PATH));
myPath.ReleaseBuffer();
So, does std::string have something similar?
While a bit unorthodox, it's perfectly valid to use std::string as a linear memory buffer; the only caveat is that this isn't actually guaranteed by the standard until C++11.
std::string s;
char* s_ptr = &s[0]; // get at the buffer
To quote Herb Sutter,
Every std::string implementation I know of is in fact contiguous and null-terminates its buffer. So, although it isn't formally guaranteed, in practice you can probably get away with calling &str[0] to get a pointer to a contiguous and null-terminated string. (But to be safe, you should still use str.c_str().)
"Probably" is key here. So, while it's not a guarantee, you should be able to rely on the principle that std::string is a linear memory buffer and you should assert facts about this in your test suite, just to be sure.
You can always build your own buffer class but when you're looking to buy, this is what the STL has to offer.
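For instance, the CString pattern from the question could be written against std::string roughly like this (a sketch that relies on the contiguity discussed above; error handling omitted):

#include <string>
#include <windows.h>

std::string getCurrentDirectory()
{
    std::string path(MAX_PATH + 1, '\0');
    DWORD len = ::GetCurrentDirectoryA(static_cast<DWORD>(path.size()), &path[0]);
    path.resize(len); // on success, len excludes the terminating NUL
    return path;
}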
Use std::vector<char> if you want a real buffer.
#include <vector>
#include <string>
#include <windows.h>

int main(){
    std::vector<char> buff(MAX_PATH + 1);
    ::GetCurrentDirectory(MAX_PATH + 1, &buff[0]);
    std::string path(buff.begin(), buff.end());
}
Not portably, no. The C++03 standard does not guarantee that a std::string has a single contiguous representation in memory (even data structures like ropes were permitted), so the API gives you no writable access to its buffer. C++11 does enforce a contiguous representation, but access to it is still read-only, via data() and/or c_str(). Because of that, the interface still allows copy-on-write.
The usual recommendation for working with C-APIs that modify arrays by accessing through pointers is to use an std::vector, which is guaranteed to have a linear memory-representation exactly for this purpose.
To sum this up: if you want to do this portably and if you want your string to end up in an std::string, you have no choice but to copy the result into the string.
It has c_str, which on all C++ implementations that I know returns the underlying buffer (but as a const char *, so you can't modify it).
std::string str("Hello world");
LPCSTR sz = str.c_str();
Keep in mind that sz will be invalidated when str is reallocated or goes out of scope. You could do something like this to decouple from the string:
std::vector<char> buf(str.begin(), str.end()); // not null terminated
buf.push_back(0); // null terminated
Or, in old-fashioned C style (note that this will not allow strings with embedded null characters):
#include <cstring>
char* sz = strdup(str.c_str());
// ... use sz
free(sz);
According to this MSDN article, I think this is the best approach for what you want to do using std::wstring directly. Second best is std::unique_ptr<wchar_t[]>, and third best is using std::vector<wchar_t>. Feel free to read the article and draw your own conclusions.
// Get the length of the text string
// (Note: +1 to consider the terminating NUL)
const int bufferLength = ::GetWindowTextLength(hWnd) + 1;
// Allocate string of proper size
std::wstring text;
text.resize(bufferLength);
// Get the text of the specified control
// Note that the address of the internal string buffer
// can be obtained with the &text[0] syntax
::GetWindowText(hWnd, &text[0], bufferLength);
// Resize down the string to avoid bogus double-NUL-terminated strings
text.resize(bufferLength - 1);
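The std::unique_ptr<wchar_t[]> variant the article ranks second would look roughly like this (my sketch, not taken from the article):

#include <memory>
#include <string>
#include <windows.h>

std::wstring getWindowText(HWND hWnd)
{
    const int bufferLength = ::GetWindowTextLengthW(hWnd) + 1; // +1 for the NUL
    std::unique_ptr<wchar_t[]> buffer(new wchar_t[bufferLength]);
    const int copied = ::GetWindowTextW(hWnd, buffer.get(), bufferLength);
    return std::wstring(buffer.get(), copied);
}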
I think you will be frowned upon by the purists of the std cult for doing this. In any case, it's much better not to rely on the bloated and generic standard library: if you want a dynamic string type that can easily be passed to low-level API functions that modify its buffer and size at the same time, without any conversions, then you will have to implement it yourself! It's actually a very challenging and interesting task. For example, in my custom txt type I overload these operators:
ui64 operator~() const; // Size operator
uli32 * operator*(); // Size modification operator
ui64 operator!() const; // True Size Operator
txt& operator--(); // Trimm operator
And also these casts:
operator const char *() const;
operator char *();
As such, I can pass the txt type to low-level API functions directly, without even calling any .c_str(). I can also pass the API function its true size (i.e. the size of the buffer) and a pointer to the internal size variable (operator*()), so that the API function can update the number of characters written, giving a valid string without the need to call a string-length function at all!
I tried to mimic basic types with this txt, so it has no public functions at all; the entire public interface is via operators. This way my txt fits perfectly with ints and other fundamental types.
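To make the idea concrete, a heavily simplified and purely hypothetical version of such a type might look like this (growth, trimming and the other operators are left out):

#include <cstdint>

class txt {
public:
    explicit txt(std::uint32_t capacity)
        : buf_(new char[capacity]()), cap_(capacity), len_(0) {}
    ~txt() { delete[] buf_; }
    txt(const txt&) = delete;
    txt& operator=(const txt&) = delete;

    std::uint64_t  operator~() const { return len_; }  // size operator
    std::uint64_t  operator!() const { return cap_; }  // true size (buffer capacity)
    std::uint32_t* operator*()       { return &len_; } // APIs write the new length here
    operator char*()             { return buf_; }
    operator const char*() const { return buf_; }

private:
    char*         buf_;
    std::uint32_t cap_;
    std::uint32_t len_;
};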
I know there is a similarly titled question already on SO but I want to know my options for this specific case.
MSVC compiler gives a warning about strcpy:
1>c:\something\mycontrol.cpp(65): warning C4996: 'strcpy': This function or
variable may be unsafe. Consider using strcpy_s instead. To disable
deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
Here's my code:
void MyControl::SetFontFace(const char *faceName)
{
    LOGFONT lf;
    CFont *currentFont = GetFont();
    currentFont->GetLogFont(&lf);
    strcpy(lf.lfFaceName, faceName); // <--- offending line
    font_.DeleteObject();
    // Create the font.
    font_.CreateFontIndirect(&lf);
    // Use the font to paint a control.
    SetFont(&font_);
}
Note font_ is an instance variable. LOGFONT is a windows structure where lfFaceName is defined as TCHAR lfFaceName[LF_FACESIZE].
What I'm wondering is can I do something like the following (and if not why not):
void MyControl::SetFontFace(const std::string& faceName)
...
lf.lfFaceName = faceName.c_str();
...
Or if there is a different alternative altogether then let me know.
The reason you're getting the security warning is, your faceName argument could point to a string that is longer than LF_FACESIZE characters, and then strcpy would blindly overwrite whatever comes after lfFaceName in the LOGFONT structure. You do have a bug.
You should not blindly fix the bug by changing strcpy to strcpy_s, because:
The *_s functions are unportable Microsoft inventions almost all of which duplicate the functionality of other C library functions that are portable. They should never be used, even in a program not intended to be portable (as this appears to be).
Blind changes tend to not actually fix this class of bug. For instance, the "safe" variants of strcpy (strncpy, strlcpy, strcpy_s) simply truncate the string if it's too long, which in this case would make you try to load the wrong font. Worse, strncpy omits the NUL terminator when it does that, so you'd probably just move the crash inside CreateFontIndirect if you used that one. The correct fix is to check the length up front and fail the entire operation if it's too long. At which point strcpy becomes safe (because you know it's not too long), although I prefer memcpy because it makes it obvious to future readers of the code that I've thought about this.
TCHAR and char are not the same thing; copying either a C-style const char * string or a C++ std::string into an array of TCHAR without a proper encoding conversion may produce complete nonsense. (Using TCHAR is, in my experience, always a mistake, and the biggest problem with it is that code like this will appear to work correctly in an ASCII build, and will still compile in UNICODE mode, but will then fail catastrophically at runtime.)
You certainly can use std::string to help with this problem, but it won't get you out of needing to check the length and manually copy the string. I'd probably do it like this. Note that I am using LOGFONTW and CreateFontIndirectW and an explicit conversion from UTF-8 in the std::string. Note also that chunks of this were cargo-culted out of MSDN and none of it has been tested. Sorry.
void MyControl::SetFontFace(const std::string& faceName)
{
    LOGFONTW lf;
    this->font_.GetLogFontW(&lf);

    int count = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                    faceName.data(), faceName.length(),
                                    lf.lfFaceName, LF_FACESIZE - 1);
    if (count <= 0)
        throw GetLastError(); // FIXME: use a real exception
    lf.lfFaceName[count] = L'\0'; // MultiByteToWideChar does not NUL-terminate.

    this->font_.DeleteObject();
    if (!this->font_.CreateFontIndirectW(&lf))
        throw GetLastError(); // FIXME: use a real exception
    // ...
}
lf.lfFaceName = faceName.c_str();
No, you shouldn't do that, because you are making a local copy of the pointer to the data held inside the std::string. If the C++ string changes or is deleted, the pointer is no longer valid, and if lfFaceName decides to change the data, this will almost certainly break the std::string.
Since you need to copy a C string, you need a C function, and strcpy_s (or its equivalent) is the safe alternative.
Have you tried? Given the information in your post, the assignment should generate a compiler error because you're trying to assign a pointer to an array, which does not work in C(++).
#include <cstdio>
#include <string>
using namespace std;

struct LOGFONT {
    char lfFaceName[3];
};

int main() {
    struct LOGFONT f;
    string foo = "bar";
    f.lfFaceName = foo.c_str();
    return 0;
}
leads to
x.c:13: error: incompatible types in assignment of `const char*' to `char[3]'
I'd recommend using a secure strcpy alternative like the warning says, given that you know the size of the destination space anyway.
#include <algorithm>
#include <iostream>
#include <string>

enum { LF_FACESIZE = 256 }; // = 3 // test too-long input

struct LOGFONT
{
    char lfFaceName[LF_FACESIZE];
};

int main()
{
    LOGFONT f;
    std::string foo("Sans-Serif");
    std::copy_n(foo.c_str(),
                foo.size() + 1 > LF_FACESIZE ? LF_FACESIZE : foo.size() + 1,
                f.lfFaceName);
    std::cout << f.lfFaceName << std::endl;
    return 0;
}
lf.lfFaceName = faceName.c_str(); won't work for two reasons (assuming you change faceName to a std::string):
The lifetime of the pointer returned by c_str() is temporary. It's only valid as long as the faceName object doesn't change and is alive.
The line won't compile. c_str() returns a pointer to a char, and lfFaceName is a character array and can't be assigned to. You need to do something to fill in the bytes at lfFaceName, and pointer assignment doesn't do that.
There isn't anything C++ that can help here, since lfFaceName is a C "string". You need to use a C string function, like strcpy or strcpy_s. You can change your code to:
strcpy_s(lf.lfFaceName, LF_FACESIZE, faceName.c_str());
Basically my task is having to sort a bunch of strings of variable length, ignoring case. I understand there is a function strcasecmp() that compares C strings, but it doesn't work on std::strings. Right now I'm using getline() for strings so I can read in the strings one line at a time. I add these to a vector of strings, then convert to C strings for each call of strcasecmp(). Instead of having to convert each string to a C string before comparing with strcasecmp(), I was wondering if there was a way I could use cin.getline() for C strings without a predefined char array size. Or would the best solution be to just read in a string, convert to a C string, store it in a vector, then sort?
I assume by "convert to cstring" you mean using the c_str() member of string. If that is the case, in most implementation that isn't really a conversion, it's just an accessor. The difference is only important if you are worried about performance (which it sounds like you are). Internally std::strings are (pretty much always, but technically do not have to be) represented as a "cstring". The class takes care of managing it's size for you, but it's just a dynamically allocated cstring underneath.
So, to directly answer: You have to specify the size of the array when using cin.getline. If you don't want to specify a size, then use getline and std::string. There's nothing wrong with that approach.
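For instance, reading the lines into a vector of strings is as simple as this (a sketch assuming input on stdin), after which you can sort with a comparator like the one in the next answer:

#include <iostream>
#include <string>
#include <vector>

int main()
{
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(std::cin, line))
        lines.push_back(line);
    // sort case-insensitively here, e.g. with the cmp comparator shown below
}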
C++ is pretty efficient on its own. Unless you have a truly proven need to do otherwise, let it do its thing.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
#include <strings.h> // strcasecmp (POSIX); on MSVC use _stricmp instead
using namespace std;

bool cmp(string a, string b)
{
    return strcasecmp(a.c_str(), b.c_str()) < 0;
}

int main(int argc, char *argv[])
{
    vector<string> strArr;
    // too lazy to test with getline(cin, str);
    strArr.push_back("aaaaa");
    strArr.push_back("AAAAA");
    strArr.push_back("ababab");
    strArr.push_back("bababa");
    strArr.push_back("abcabc");
    strArr.push_back("cbacba");
    strArr.push_back("AbCdEf");
    strArr.push_back("aBcDeF");
    strArr.push_back(" whatever");
    sort(strArr.begin(), strArr.end(), cmp);
    copy(strArr.begin(), strArr.end(), ostream_iterator<string>(cout, " \n"));
    return 0;
}
What is the most optimal way to achieve the same as this?
void foo(double floatValue, char* stringResult)
{
    sprintf(stringResult, "%f", floatValue);
}
I'm sure someone will say boost::lexical_cast, so go for that if you're using boost, but it's basically the same as this anyway:
#include <sstream>
#include <string>

std::string doubleToString(double d)
{
    std::ostringstream ss;
    ss << d;
    return ss.str();
}
Note that you could easily make this into a template that works on anything that can be stream-inserted (not just doubles).
http://www.cplusplus.com/reference/iostream/stringstream/
double d=123.456;
stringstream s;
s << d; // insert d into s
Boost::lexical_cast<>
On dinkumware STL, the stringstream is filled out by the C library snprintf.
Thus using snprintf formatting directly will be comparable with the STL formatting part.
But someone once told me that the whole is greater than or equal to the sum of its known parts.
As it will be platform dependent whether stringstream will do an allocation (and I am quite sure that DINKUMWARE DOES NOT YET include a small buffer in stringstream for conversions of single items like yours), it is truly doubtful that ANYTHING that requires an allocation (ESPECIALLY if MULTITHREADED) can compete with snprintf.
In fact (formatting+allocation) has a chance of being really terrible as an allocation and a release might well require 2 full read-modify-write cycles in a multithreaded environment unless the allocation implementation has a thread local small heap.
That being said, if I were truly concerned about performance, I would take the advice from some of the other comments above, change the interface to include a size, and use snprintf - i.e.
bool
foo(const double d, char* const p, const size_t n)
{
    // snprintf reports how many characters it wanted to write (excluding
    // the NUL); if that is >= n, the output was truncated and didn't fit.
    const int written = snprintf(p, n, "%f", d);
    return written >= 0 && static_cast<size_t>(written) < n;
}
If you want a std::string you are still better off using the above and instantiating the string from the resultant char* as there will be 2 allocations + 2 releases involved with the std::stringstream, std::string solution.
BTW, I cannot tell whether the "string" in the question means std::string or just a generic ASCII-character "string".
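For example, wrapping the snprintf call and building the std::string once at the end might look like this (a sketch; the buffer size is an arbitrary choice):

#include <algorithm>
#include <cstdio>
#include <string>

std::string doubleToString(double d)
{
    char buf[512]; // "%f" of a very large double can run to several hundred characters
    const int n = std::snprintf(buf, sizeof buf, "%f", d);
    const std::size_t len =
        (n < 0) ? 0 : std::min(static_cast<std::size_t>(n), sizeof buf - 1);
    return std::string(buf, len);
}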
The best thing to do would be to build a simple templatized function to convert any streamable type into a string. Here's the way I do it:
#include <sstream>
#include <string>

template <typename T>
const std::string to_string(const T& data)
{
    std::ostringstream conv;
    conv << data;
    return conv.str();
}
If you want a const char* representation, call .c_str() on the returned string at the call site (just don't hold on to that pointer beyond the lifetime of the string it came from).
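Usage is then just, for example:
std::string s = to_string(3.14159); // formatted with the default stream precision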
I'd probably go with what you suggested in your question, since there's no built-in ftoa() function and sprintf gives you control over the format. A google search for "ftoa asm" yields some possibly useful results, but I'm not sure you want to go that far.
I'd say sprintf is pretty much the optimal way. You may prefer snprintf over it, but it doesn't have much to do with performance.
Herb Sutter has done an extensive study on the alternatives for converting an int to a string, but I would think his arguments hold for a double as well.
He looks at the balances between safety, efficiency, code clarity and usability in templates.
Read it here: http://www.gotw.ca/publications/mill19.htm
_gcvt or _gcvt_s.
If you use the Qt4 framework you could do:
double d = 5.5;
QString num = QString::number(d);
This is a very useful thread. I use sprintf_s for it, but I started to doubt whether it is really faster than the other ways. I came across the following document on the Boost website, which shows a performance comparison between printf/scanf, stringstream and Boost.
Double to string is the most common conversion we do in our code, so I'll stick with what I've been using. But using Boost in other scenarios could be your deciding factor.
http://www.boost.org/doc/libs/1_58_0/doc/html/boost_lexical_cast/performance.html
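For reference, the Boost version mentioned throughout this thread is a one-liner (assuming Boost is available):

#include <boost/lexical_cast.hpp>
#include <string>

std::string s = boost::lexical_cast<std::string>(123.456);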
In the future, you can use std::to_chars to write code like https://godbolt.org/z/cEO4Sd . Unfortunately, only VS2017 and VS2019 support part of this functionality...
#include <iostream>
#include <charconv>
#include <system_error>
#include <string_view>
#include <array>
int main()
{
    std::array<char, 10> chars;
    auto [parsed, error] = std::to_chars(
        chars.data(),
        chars.data() + chars.size(),
        static_cast<double>(12345.234));
    std::cout << std::string_view(chars.data(), parsed - chars.data());
}
For a lengthy discussion on MSVC details, see
https://www.reddit.com/r/cpp/comments/a2mpaj/how_to_use_the_newest_c_string_conversion/eazo82q/