This program (it has been narrowed down from a larger program) always crashes when compiled with VS2008 in Release (Win32) mode under Windows 7. I am not familiar with assembly code and don't know whether it is a bug in the compiler, in boost::ends_with, or in boost::asio::buffers_iterator. It compiles and runs with g++ on Ubuntu without any problem.
People say a compiler bug is very unlikely, but when compiled in Debug mode (or with optimization disabled), the problem does disappear.
I have been stuck with this problem for quite a few hours. Any help is appreciated. Thanks in advance.
#include <iostream>
#include <string>
#include <boost/asio.hpp>
#include <boost/algorithm/string.hpp>
typedef boost::asio::buffers_iterator<boost::asio::const_buffers_1> iterator_t;
typedef boost::iterator_range<iterator_t> range_t;
static const std::string LINE_END_MARK = "\r\n";
int main(int argc, char* argv[])
{
boost::asio::streambuf _buf;
std::ostream os(&_buf);
os<<"END\r\n";
iterator_t cursor = boost::asio::buffers_begin(_buf.data());
iterator_t end = boost::asio::buffers_end(_buf.data());
std::ostream_iterator<char> it(std::cout," ");
std::copy(LINE_END_MARK.begin(), LINE_END_MARK.end(), it);
range_t r(cursor, end);
if(!boost::ends_with(r, LINE_END_MARK))
return 0;
return 1;
}
Edit: I misread the code, sorry.
Your cursor and end iterators are pointing to invalid memory. You modified the underlying streambuf which reallocated during the copy to the output iterator. asio streambuf lets you access the raw memory for performance reasons, but the caveat is you have to worry about things like this.
Debug and release will change the way allocation and deallocation behave with regards to the underlying size of the allocated block and how memory may be fenced, guarded, initialized, aligned, etc.
Construct your iterators after the copy operation to fix your problem.
It doesn't work because 'range_t r(cursor, end)' is a range of "buffers" not a range of characters. So you are comparing a list of buffer pointers with each character in LINE_END_MARK.
It crashes in release mode under Win32 because on Windows you end up dereferencing a null pointer.
Boost.Asio has this concept of multiple buffers, but currently it's not really used. If you look at the implementation, it only really uses 'const_buffers_1' or 'mutable_buffers_1', which is basically a list of 1 buffer.
I assume you want to compare the contents of the buffer, not a list of buffers range.
So you want to do something like:
typedef boost::iterator_range<const char*> range_t;
range_t r(boost::asio::buffer_cast<const char*>(_buf.data()), boost::asio::buffer_cast<const char*>(_buf.data()) + boost::asio::buffer_size(_buf.data()));
if(!boost::ends_with(r, LINE_END_MARK))
return 0;
return 1;
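If Boost's string algorithms are not wanted for this check, the same comparison on the raw character range can be sketched with the standard library alone. This is only a minimal sketch; range_ends_with is a hypothetical helper name, not part of Boost or Asio.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if the character range [data, data + size)
// ends with the given suffix.
bool range_ends_with(const char* data, std::size_t size, const std::string& suffix) {
    assert(data != nullptr || size == 0);
    if (suffix.size() > size) {
        return false; // the range is too short to contain the suffix
    }
    // Compare the suffix against the last suffix.size() characters of the range.
    return std::equal(suffix.begin(), suffix.end(), data + (size - suffix.size()));
}
```

With the streambuf from the question, data and size would come from boost::asio::buffer_cast<const char*>(_buf.data()) and boost::asio::buffer_size(_buf.data()).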
Related
How does one store sensitive data (ex: passwords) in std::string?
I have an application which prompts the user for a password and passes it to a downstream server during connection setup. I want to securely clear the password value after the connection has been established.
If I store the password as a char * array, I can use APIs like SecureZeroMemory to get rid of the sensitive data from the process memory. However, I want to avoid char arrays in my code and am looking for something similar for std::string?
Based on the answer given here, I wrote an allocator to securely zero memory.
#include <string>
#include <windows.h>
namespace secure
{
template <class T> class allocator : public std::allocator<T>
{
public:
template<class U> struct rebind { typedef allocator<U> other; };
allocator() throw() {}
allocator(const allocator &) throw() {}
template <class U> allocator(const allocator<U>&) throw() {}
void deallocate(T* p, std::size_t num)
{
// num counts elements, not bytes, so scale by sizeof(T) before wiping
SecureZeroMemory(p, num * sizeof(T));
std::allocator<T>::deallocate(p, num);
}
};
typedef std::basic_string<char, std::char_traits<char>, allocator<char> > string;
}
int main()
{
{
secure::string bar("bar");
secure::string longbar("baaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaar");
}
}
However, it turns out, depending on how std::string is implemented, it is possible that the allocator isn't even invoked for small values. In my code, for example, the deallocate doesn't even get called for the string bar (on Visual Studio).
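The effect can be observed without any Windows API by counting allocator calls. The following is a minimal sketch; counting_allocator, counted_string, g_allocations and allocations_for are hypothetical names introduced only for this illustration, and the counts depend on the standard library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical global counter, for illustration only.
static std::size_t g_allocations = 0;

// Minimal C++11 allocator that counts how often allocate() is called.
template <class T>
struct counting_allocator {
    typedef T value_type;
    counting_allocator() {}
    template <class U> counting_allocator(const counting_allocator<U>&) {}
    T* allocate(std::size_t n) {
        ++g_allocations;
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) {
        std::allocator<T>().deallocate(p, n);
    }
};

template <class T, class U>
bool operator==(const counting_allocator<T>&, const counting_allocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const counting_allocator<T>&, const counting_allocator<U>&) { return false; }

typedef std::basic_string<char, std::char_traits<char>, counting_allocator<char> > counted_string;

// Returns how many times the allocator was asked for memory while a string
// holding `text` was constructed and destroyed.
std::size_t allocations_for(const char* text) {
    assert(text != nullptr);
    g_allocations = 0;
    {
        counted_string s(text);
    }
    return g_allocations;
}
```

With a small-string optimization, a call such as allocations_for("bar") is likely to report no allocation at all, while the 39-character string from the question has to go through the allocator. The exact threshold is an implementation detail.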
The answer, then, is that we cannot use std::string to store sensitive data. Of course, we have the option to write a new class that handles the use case, but I was specifically interested in using std::string as defined.
Thanks everyone for your help!
openssl went through a couple of iterations of securely erasing a string until it settled on this approach:
#include <string.h>
#include <string>
// Pointer to memset is volatile so that compiler must de-reference
// the pointer and can't assume that it points to any function in
// particular (such as memset, which it then might further "optimize")
typedef void* (*memset_t)(void*, int, size_t);
static volatile memset_t memset_func = memset;
void cleanse(void* ptr, size_t len) {
memset_func(ptr, 0, len);
}
int main() {
std::string secret_str = "secret";
secret_str.resize(secret_str.capacity(), 0);
cleanse(&secret_str[0], secret_str.size());
secret_str.clear();
return 0;
}
It is a complicated topic, as an optimizing compiler will work against you. Straightforward approaches like looping over the string and overwriting each character are not reliable, as the compiler might optimize them away. The same applies to memset. C11 added memset_s, which must not be optimized away, but it is an optional part of the standard (Annex K) and is not available on all platforms.
For that reason, I would strongly recommend using a trusted crypto library for that task and letting its authors take care of portability. Secure wiping is a basic operation (taking a C array and overwriting it securely), which all such libraries have to implement at some point. Note that the underlying data in a std::string is contiguous (as mandated by the C++11 standard, but in practice even in C++98/03 you could assume it). Therefore, you can use the secure wiping facilities of the crypto library by treating the std::string as an array.
In OpenSSL, secure wiping is provided by the OPENSSL_cleanse function. Crypto++ has memset_z:
std::string secret;
// ...
// OpenSSL (#include <openssl/crypto.h> and link -lcrypto)
OPENSSL_cleanse(&secret[0], secret.size());
// Crypto++ (#include <crypto++/misc.h> and link -lcrypto++)
CryptoPP::memset_z(&secret[0], 0, secret.size());
As a side-note, if you design the API from scratch, consider avoiding std::string altogether when it comes to storing secrets. It was not a design goal of std::string to prevent leaking the secret (or parts of it during resizing or copying).
For posterity, I once decided to ignore this advice and use std::string anyway, and wrote a zero() method using c_str() (and casting away the constness) and volatile. As long as I was careful not to cause a reallocation or move of the contents, and I manually called zero() wherever I needed the string clean, everything seemed to function properly. Alas, I discovered another serious flaw the hard way: std::string can also be a reference-counted object... blasting the memory at c_str() (or the memory the referenced object is pointing to) will unknowingly blast the other object.
For Windows:
std::string s("ASecret");
const char* const ptr = s.data();
SecureZeroMemory((void*)ptr, s.size());
This shall securely clear the data from the stack or the heap depending on the STL internals.
Works on all sizes of the string no matter small or large.
Caution !
DO NOT USE ptr for altering the data of the string which might result in increasing or decreasing the length.
std::string is based on a char*. Somewhere behind all the dynamic magic is a char*. So when you say you don't want to use char*'s in your code, you are still using a char*, it's just in the background with a whole bunch of other garbage piled on top of it.
I'm not too experienced with process memory, but you could always iterate through each character (after you've encrypted and stored the password in a DB?), and set it to a different value.
There's also a std::basic_string, but I'm not sure what help that would do for you.
std::string mystring;
...
std::fill(mystring.begin(), mystring.end(), 0);
or even better write your own function:
void clear(std::string &v)
{
std::fill(v.begin(), v.end(), 0);
}
Consider a scenario, where std::string is used to store a secret. Once it is consumed and is no longer needed, it would be good to cleanse it, i.e overwrite the memory that contained it, thus hiding the secret.
std::string provides a function const char* data() returning a pointer to (since C++11) contiguous memory.
Now, since the memory is contiguous and the variable will be destroyed right after the cleanse due to scope end, would it be safe to:
char* modifiable = const_cast<char*>(secretString.data());
OPENSSL_cleanse(modifiable, secretString.size());
According to the standard quoted here:
$5.2.11/7 - Note: Depending on the type of the object, a write operation through the pointer, lvalue or pointer to data member resulting from a const_cast that casts away a const-qualifier may produce undefined behavior (7.1.5.1).
That would advise otherwise, but do the conditions above (contiguous, to-be-just-removed) make it safe?
The standard explicitly says you must not write to the const char* returned by data(), so don't do that.
There are perfectly safe ways to get a modifiable pointer instead:
if (secretString.size())
OPENSSL_cleanse(&secretString.front(), secretString.size());
Or if the string might have been shrunk already and you want to ensure its entire capacity is wiped:
if (secretString.capacity()) {
secretString.resize(secretString.capacity());
OPENSSL_cleanse(&secretString.front(), secretString.size());
}
It is probably safe. But not guaranteed.
However, since C++11, a std::string must be implemented as contiguous data so you can safely access its internal array using the address of its first element &secretString[0].
if(!secretString.empty()) // avoid UB
{
char* modifiable = &secretString[0];
OPENSSL_cleanse(modifiable, secretString.size());
}
std::string is a poor choice to store secrets. Since strings are copyable and sometimes copies go unnoticed, your secret may "get legs". Furthermore, string expansion techniques may cause multiple copies of fragments (or all of) your secrets.
Experience dictates a movable, non-copyable, wiped clean on destroy, unintelligent (no tricky copies under-the-hood) class.
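One way such a class might look is sketched below. This is a minimal sketch under stated assumptions, not a vetted implementation: secret_buffer is a hypothetical name, and the volatile write loop is only a best-effort stand-in for a real wiping primitive such as OPENSSL_cleanse.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>

// Hypothetical movable, non-copyable buffer that overwrites its contents on destruction.
class secret_buffer {
public:
    secret_buffer(const char* data, std::size_t size)
        : data_(new char[size]), size_(size) {
        assert(data != nullptr || size == 0);
        if (size != 0) {
            std::memcpy(data_.get(), data, size);
        }
    }

    // Copies are forbidden so the secret cannot silently multiply.
    secret_buffer(const secret_buffer&) = delete;
    secret_buffer& operator=(const secret_buffer&) = delete;

    // Moving transfers ownership of the one and only buffer.
    secret_buffer(secret_buffer&& other) noexcept
        : data_(std::move(other.data_)), size_(other.size_) {
        other.size_ = 0;
    }
    secret_buffer& operator=(secret_buffer&& other) noexcept {
        if (this != &other) {
            wipe(); // clear the old secret before releasing it
            data_ = std::move(other.data_);
            size_ = other.size_;
            other.size_ = 0;
        }
        return *this;
    }

    ~secret_buffer() { wipe(); }

    const char* data() const { return data_.get(); }
    std::size_t size() const { return size_; }

private:
    // Best-effort wipe: writes through a volatile pointer so that the stores
    // are not treated as dead by the optimizer.
    void wipe() {
        volatile char* p = data_.get();
        for (std::size_t i = 0; i < size_; ++i) {
            p[i] = 0;
        }
    }

    std::unique_ptr<char[]> data_;
    std::size_t size_;
};
```

The buffer never grows, so there is no reallocation that could leave stale fragments behind, which is the main thing std::string cannot promise.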
You can use std::fill to fill the string with trash:
std::fill(str.begin(),str.end(), 0);
Do note that simply clearing or shrinking the string (with methods such as clear or shrink_to_fit) does not guarantee that the string data will be deleted from the process memory. Malicious processes may dump the process memory and can extract the secret if the string is not overwritten correctly.
Bonus: Interestingly, the ability to trash the string data for security reasons forces some programming languages like Java to return passwords as char[] and not String. In Java, String is immutable, so "trashing" it will make a new copy of the string. Hence, you need a modifiable object like char[] which does not use copy-on-write.
Edit: if your compiler does optimize this call out, you can use compiler-specific pragmas or attributes to make sure a trashing function will not be optimized out:
#if defined(_MSC_VER)
#pragma optimize("", off)
void trashString(std::string& str){
std::fill(str.begin(),str.end(),0);
}
#pragma optimize("", on)
#elif defined(__clang__)
// clang also defines __GNUC__, so it has to be tested before GCC
void __attribute__((optnone)) trashString(std::string& str) {
std::fill(str.begin(),str.end(),0);
}
#elif defined(__GNUC__)
void __attribute__((optimize("O0"))) trashString(std::string& str) {
std::fill(str.begin(),str.end(),0);
}
#endif
There's a better answer: don't!
std::string is a class which is designed to be userfriendly and efficient. It was not designed with cryptography in mind, so there are few guarantees written into it to help you out. For example, there's no guarantees that your data hasn't been copied elsewhere. At best, you could hope that a particular compiler's implementation offers you the behavior you want.
If you actually want to treat a secret as a secret, you should handle it using tools which are designed for handling secrets. In fact, you should develop a threat model for what capabilities your attacker has, and choose your tools accordingly.
Tested solution on CentOS 6, Debian 8 and Ubuntu 16.04 (g++/clang++, O0, O1, O2, O3):
secretString.resize(secretString.capacity(), '\0');
OPENSSL_cleanse(&secretString[0], secretString.size());
secretString.clear();
If you were really paranoid you could randomise the data in the cleansed string, so as not to give away the length of the string or a location that contained sensitive data:
#include <string>
#include <stdlib.h>
#include <string.h>
typedef void* (*memset_t)(void*, int, size_t);
static volatile memset_t memset_func = memset;
void cleanse(std::string& to_cleanse) {
to_cleanse.resize(to_cleanse.capacity(), '\0');
for (size_t i = 0; i < to_cleanse.size(); ++i) {
memset_func(&to_cleanse[i], rand(), 1);
}
to_cleanse.clear();
}
You could seed the rand() if you wanted also.
You could also do similar string cleansing without openssl dependency, by using explicit_bzero to null the contents:
#include <string>
#include <string.h>
int main() {
std::string secretString = "ajaja";
secretString.resize(secretString.capacity(), '\0');
explicit_bzero(&secretString[0], secretString.size());
secretString.clear();
return 0;
}
I am working on an embedded SW project. A lot of strings are stored inside flash memory. I would like to use these strings (usually const char* or const wchar_t*) as std::string's data. That means I want to avoid creating a copy of the original data because of memory restrictions.
An extended use might be to read the flash data via stringstream directly out of the flash memory.
Example, which unfortunately copies the data instead of using it in place:
const char* flash_adr = reinterpret_cast<const char*>(0x00300000);
size_t length = 3000;
std::string str(flash_adr, length);
Any ideas will be appreciated!
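For the stream use case mentioned above, one possible approach is a read-only stream buffer whose get area points straight at the existing memory, so nothing is copied until a value is extracted. This is a minimal sketch; memory_streambuf and first_word are hypothetical names, and an ordinary string literal stands in for the flash address.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <streambuf>
#include <string>

// Hypothetical read-only stream buffer over memory that is owned elsewhere.
class memory_streambuf : public std::streambuf {
public:
    memory_streambuf(const char* data, std::size_t size) {
        assert(data != nullptr || size == 0);
        // setg() takes char*, but the get area is only ever read here.
        char* begin = const_cast<char*>(data);
        setg(begin, begin, begin + size);
    }
};

// Reads the first whitespace-delimited word directly out of the given memory.
std::string first_word(const char* data, std::size_t size) {
    memory_streambuf buffer(data, size);
    std::istream in(&buffer);
    std::string word;
    in >> word;
    return word;
}
```

Only the extracted word is copied into a std::string; the source block itself stays where it is.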
If you are willing to go with compiler and library specific implementations, here is an example that works in MSVC 2013.
#include <iostream>
#include <string>
int main() {
std::string str("A std::string with a larger length than yours");
char *flash_adr = "Your source from the flash";
char *reset_adr = str._Bx._Ptr; // Keep the old address around
// Change the inner buffer
(char*)str._Bx._Ptr = flash_adr;
std::cout << str << std::endl;
// Reset the pointer or the program will crash
(char*)str._Bx._Ptr = reset_adr;
return 0;
}
It will print Your source from the flash.
The idea is to reserve a std::string capable of fitting the strings in your flash and keep on changing its inner buffer pointer.
You need to customize this for your compiler and as always, you need to be very very careful.
I have now used string_span described in CPP Core Guidelines (https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md). GSL provides a complete implementation (GSL: Guidelines Support Library https://github.com/Microsoft/GSL).
If you know the address of your string inside flash memory you can just use the address directly with the following constructor to create a string_span.
constexpr basic_string_span(pointer ptr, size_type length) noexcept
: span_(ptr, length)
{}
std::string_view would probably have done the same job, as Captain Obvlious (https://stackoverflow.com/users/845568/captain-obvlious) pointed out in a comment, which is my favourite one.
I am quite happy with the solution. It performs well and keeps the code readable.
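For comparison, the std::string_view variant could look roughly like this. It is a minimal sketch assuming a C++17 compiler; flash_text and flash_view are hypothetical names, and the static array only stands in for a fixed flash address.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Stand-in for a string stored in flash; on a real target this would be a
// fixed address rather than an array in the program image.
static const char flash_text[] = "Your source from the flash";

// Wraps existing memory in a non-owning view; no characters are copied.
std::string_view flash_view(const char* address, std::size_t length) {
    assert(address != nullptr || length == 0);
    return std::string_view(address, length);
}
```

The view offers most of the read-only std::string interface (find, substr, comparison) while still pointing at the original bytes.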
Many windows APIs take a pointer to a buffer and a size element but the result needs to go into a c++ string. (I'm using windows unicode here so they are wstrings)
Here is an example:
#include <iostream>
#include <string>
#include <vector>
#include <windows.h>
using namespace std;
// This is the method I'm interested in improving ...
wstring getComputerName()
{
vector<wchar_t> buffer;
buffer.resize(MAX_COMPUTERNAME_LENGTH+1);
DWORD size = MAX_COMPUTERNAME_LENGTH + 1; // buffer size in characters, including the terminating null
GetComputerNameW(&buffer[0], &size);
return wstring(&buffer[0], size);
}
int main()
{
wcout << getComputerName() << "\n";
}
My question really is, is this the best way to write the getComputerName function so that it fits into C++ better, or is there a better way? I don't see any way to use a string directly without going via a vector unless I missed something? It works fine, but somehow seems a little ugly. The question isn't about that particular API, it's just a convenient example.
In this case, I don't see what std::vector brings to the party. MAX_COMPUTERNAME_LENGTH is not likely to be very large, so I would simply use a C-style array as the temporary buffer.
See this answer to another question. It provides the source to a StringBuffer class which handles this situation very cleanly.
I would say, since you are already at task of abstracting Windows API behind a more generic C++ interface, do away with vector altogether, and don't bother about wstring constructor:
wstring getComputerName()
{
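wchar_t name[MAX_COMPUTERNAME_LENGTH + 1] = {}; // zero-filled, so a failed call yields an empty string
DWORD size = MAX_COMPUTERNAME_LENGTH + 1; // buffer size in characters, including the terminating null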
GetComputerNameW(name, &size);
return name;
}
This function will return a valid wstring object.
I'd use the vector. In response to you saying you picked a bad example, pretend for a moment that we don't have a reasonable constant upper bound on the string length. Then it's not quite as easy:
#include <string>
#include <vector>
#include <windows.h>
using std::wstring;
using std::vector;
wstring getComputerName()
{
DWORD size = 1; // or a bigger number if you like
vector<wchar_t> buffer(size);
while ((GetComputerNameW(&buffer[0], &size) == 0))
{
if (GetLastError() != ERROR_BUFFER_OVERFLOW) aargh(); // handle error
buffer.resize(++size);
};
return wstring(&buffer[0], size);
}
In practice, you can probably get away with writing into a string, but I'm not entirely sure. You certainly need additional guarantees made by your implementation of std::wstring, beyond what's in the standard, but I expect MSVC's strings are probably OK.
I think that if wstring::reference is wchar_t& then you're sorted. 21.3.4 defines that non-const operator[] returns a reference, and that it returns data()[pos]. So if reference is just a plain wchar_t& then there's no scope for exciting copy-on-write behaviour through the reference, and the string must in fact be modifiable through the pointer &buffer[0]. I think. The basic problem here is that the standard allowed implementations more flexibility than turned out to be needed.
That's a lot of effort and commenting though, just to avoid copying a string, so I've never felt the need to avoid an intermediate array/vector.
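Since C++11 guarantees contiguous storage, writing straight into the string is well defined. The pattern is sketched below with std::snprintf standing in for the Windows API, because it has the same "buffer plus size" shape and runs anywhere. format_value is a hypothetical name used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Lets a C-style "buffer + size" function write directly into a std::string.
std::string format_value(int value) {
    // First call: ask how many characters are needed (excluding the null).
    const int needed = std::snprintf(nullptr, 0, "value=%d", value);
    if (needed < 0) {
        return std::string(); // formatting failed
    }
    // Reserve room for the characters plus the terminating null the C
    // function is going to write.
    std::string result(static_cast<std::size_t>(needed) + 1, '\0');
    const int written = std::snprintf(&result[0], result.size(), "value=%d", value);
    assert(written == needed);
    (void)written; // silences unused-variable warnings when NDEBUG is defined
    // Drop the terminating null so size() matches the text length.
    result.resize(static_cast<std::size_t>(needed));
    return result;
}
```

The same two steps (size the string, pass &result[0] and the size, then trim) apply to an API such as GetComputerNameW, with std::wstring in place of std::string.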
The code below compiled in Debug configuration in VS2005 SP1 shows two messages with “ITERATOR LIST CORRUPTED” notice.
Code Snippet
#define _SECURE_SCL 0
#define _HAS_ITERATOR_DEBUGGING 0
#include <sstream>
#include <string>
int main()
{
std::stringstream stream;
stream << "123" << std::endl;
std::string str = stream.str();
std::string::const_iterator itFirst = str.begin();
int position = str.find('2');
std::string::const_iterator itSecond = itFirst + position;
std::string tempStr(itFirst,itSecond); ///< errors are here
return 0;
}
Is it a bug in the compiler or standard library?
My bad! Edit: Yeah problem with compiler. See this -- particularly the Community Content section.
What @dirkgently said in his edit.
Apparently, some code for std::string is located in the runtime DLL; in particular, the macro definition does not take effect for the constructor, and the code for iterator debugging gets executed. You can fix this by linking the runtime library statically.
I would consider this a bug, though perhaps not in the Visual Studio itself, but in the documentation.
There is a problem with your code. Well, several in fact:
str.find('2') returns a size_t; you have a potential conversion problem if the value returned (like std::string::npos) is greater than what an int can hold (you would end up with a negative int, I think...)
if position is negative, or equal to std::string::npos, then the range itFirst,itSecond is ill-defined (either because itSecond is before itFirst or because it is past str.end())
Correct your code, and check whether it still throws. Iterator debugging is there to help you catch these mistakes; disabling it is acting like an ostrich.
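A corrected version of the range construction might look like the following minimal sketch. prefix_before is a hypothetical helper name; it keeps the result of find in the string's own size type and checks for npos before forming the second iterator.

```cpp
#include <cassert>
#include <string>

// Returns the part of `text` before the first occurrence of `marker`,
// or an empty string if `marker` does not occur.
std::string prefix_before(const std::string& text, char marker) {
    const std::string::size_type position = text.find(marker);
    if (position == std::string::npos) {
        return std::string(); // not found: do not build an out-of-range iterator
    }
    assert(position < text.size());
    const std::string::const_iterator first = text.begin();
    const std::string::const_iterator second =
        first + static_cast<std::string::difference_type>(position);
    return std::string(first, second);
}
```

This does not address the runtime library issue discussed in the other answers; it only removes the conversion and range problems listed above.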