I currently have a memory issue using the Botan library (version 2.15) for cryptography functions within a C++ project. My development environment is Solus Linux 4.1 (kernel-current), but I could observe this issue on Debian Buster too.
I observed that some memory allocated internally by Botan for calculations is not deallocated when it goes out of scope. When I call Botan::HashFunction, Botan::StreamCipher and Botan::scrypt multiple times, going out of scope in between each time, the memory footprint increases steadily.
For example, consider this code:
#include <iostream>
#include <vector>
#include "botan/scrypt.h"
void pause() {
char ch;
std::cout << "Insert any key to proceed... ";
std::cin >> ch;
}
std::vector<uint8_t> get_scrypt_passhash(std::string const& password, std::string const& salt) {
std::vector<uint8_t> key (32);
Botan::scrypt(key.data(), key.size(), password.c_str(), password.length(), salt.c_str(), salt.length(), 65536, 32, 1);
std::cout << "From function: before closing.\n";
pause();
return key;
}
int main(int argc, char *argv[]) {
std::cout << "Beginning test.\n";
pause();
auto pwhashed = get_scrypt_passhash(argv[1], argv[2]);
std::cout << "Test ended.\n";
pause();
}
I used the pause() function to observe the memory consumption (I ran top/pmap and watched KSysGuard during each pause). When it is called from within get_scrypt_passhash just before returning, the used memory (according to both top/pmap and KSysGuard) is about 2 MB more than at the beginning, and it stays the same after the function has returned.
I tried to dive into the Botan source code, but I cannot find memory leaks or the like. Valgrind also reported that all allocated bytes had been freed, so it found no leaks.
Just for information, I tried the same functionality with Crypto++ without observing this behavior.
Has anyone experienced the same issue? Is there a way to fix it?
Related
I would like to have an example showing the difference in memory use between passing values to a function by value and by reference.
The question here is: how can I monitor, inside a C++ program, how much stack/heap memory is being used?
I have a recursive function which I hope is a good example:
#include <iostream>
#include <vector>
typedef std::vector<double> myVec;
void recursiveFunc(myVec n)
{
if (n[0] == 0)
{
//std::cout << "I am the last one" << std::endl;
return;
}
else
{
n[0] -= 1;
//std::cout << "I am here" << std::endl;
recursiveFunc(n);
}
}
int main()
{
myVec v2 = {1000000,2,3,4,5};
recursiveFunc(v2);
return 0;
}
How can I monitor inside a C++ program how much stack/ heap memory is being used
There is no standard way to measure how much stack or heap memory is being used in C++. System-specific APIs exist. The first step in using such an API is to find out which system you are programming for.
External tools exist to profile peak memory use (both stack and heap), such as valgrind.
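As an illustration of one such system-specific API, here is a minimal sketch that assumes Linux (or another POSIX system) and uses getrusage(). The helper names peak_rss_kb and peak_after_touching are made up for this example. Note that it reports the peak resident memory of the whole process, not stack and heap separately.

```cpp
#include <sys/resource.h>  // getrusage (POSIX)
#include <cstddef>
#include <vector>

// Peak resident set size of the calling process, or -1 if the query fails.
// Assumption: Linux, where ru_maxrss is reported in kilobytes
// (macOS reports the same field in bytes).
long peak_rss_kb() {
    struct rusage usage {};
    if (getrusage(RUSAGE_SELF, &usage) != 0) {
        return -1;
    }
    return usage.ru_maxrss;
}

// Hypothetical helper: allocates and fills `bytes` bytes on the heap,
// then reports the peak again.
long peak_after_touching(std::size_t bytes) {
    std::vector<char> block(bytes, 1);
    volatile char sink = block[bytes - 1];  // read the buffer so it is actually used
    (void)sink;
    return peak_rss_kb();
}
```

Calling peak_rss_kb() before and after the recursive call in the question should show the difference between the by-value and by-reference versions, since each by-value call copies the vector.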
We are undergoing PCI PA-DSS certification, and one of its requirements is to avoid writing the clear PAN (card number) to disk. The application does not write such information to disk, but if the operating system (Windows, in this case) needs to swap, the memory contents are written to the page file. Therefore the application must clean up its memory to prevent RAM-capture tools from reading sensitive data.
There are three situations to handle:
heap allocation (malloc): before freeing the memory, the area can be cleaned up with memset
static or global data: after being used, the area can be cleaned up using memset
local data (function member): the data is put on stack and is not accessible after the function is finished
For example:
void test()
{
char card_number[17];
strcpy(card_number, "4000000000000000");
}
After test executes, the memory still contains the card_number information.
A single statement could zero the variable card_number at the end of test, but this would have to be done in every function in the program.
memset(card_number, 0, sizeof(card_number));
Is there a way to clean up the stack at some point, like right before the program finishes?
Cleaning the stack right when the program finishes might be too late: it could already have been swapped out at any point during its runtime. You should keep your sensitive data only in memory locked with VirtualLock so that it does not get swapped out. The locking has to happen before the sensitive data is read.
There is a small limit on how much memory you can lock like this, so you probably cannot lock the whole stack and should avoid storing sensitive data on the stack at all.
I assume you want to get rid of this situation below:
#include <cstring>
#include <iostream>
using namespace std;
void test()
{
char card_number[17];
strcpy(card_number, "1234567890123456");
cout << "test() -> " << card_number << endl;
}
void test_trash()
{
// don't initialize, so get the trash from previous call to test()
char card_number[17];
cout << "trash from previous function -> " << card_number << endl;
}
int main(int argc, const char * argv[])
{
test();
test_trash();
return 0;
}
Output:
test() -> 1234567890123456
trash from previous function -> 1234567890123456
You CAN do something like this:
#include <cstring>
#include <iostream>
using namespace std;
class CardNumber
{
char card_number[17];
public:
CardNumber(const char * value)
{
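// copy at most 16 characters and always terminate the string
strncpy(card_number, value, sizeof(card_number) - 1);
card_number[sizeof(card_number) - 1] = '\0';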
}
virtual ~CardNumber()
{
// as suggested by @piedar, memset_s(), so the compiler
// doesn't optimize it away.
memset_s(card_number, sizeof(card_number), 0, sizeof(card_number));
}
const char * operator()()
{
return card_number;
}
};
void test()
{
CardNumber cardNumber("1234567890123456");
cout << "test() -> " << cardNumber() << endl;
}
void test_trash()
{
// don't initialize, so get the trash from previous call to test()
char card_number[17];
cout << "trash from previous function -> " << card_number << endl;
}
int main(int argc, const char * argv[])
{
test();
test_trash();
return 0;
}
Output:
test() -> 1234567890123456
trash from previous function ->
You can do something similar to clean up memory on the heap or static variables.
Obviously, we assume the card number will come from a dynamic source instead of the hard-coded thing...
AND YES: to explicitly answer the title of your question: the stack will not be cleaned automatically... you have to clean it yourself.
I believe it is necessary, but this is only half of the problem.
There are two issues here:
In principle, nothing prevents the OS from swapping your data while you are still using it. As pointed out in the other answer, you want VirtualLock on Windows and mlock on Linux.
You need to prevent the optimizer from optimizing out the memset. This also applies to global and dynamically allocated memory. I strongly suggest taking a look at Crypto++'s SecureWipeBuffer.
In general, you should avoid doing it manually, as it is an error-prone procedure. Instead, consider using a custom allocator or a custom class template for secure data that wipes its contents in the destructor.
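The two points above can be combined in a small wrapper class. The following is only a sketch under assumptions: it targets Linux/POSIX (mlock/munlock; on Windows the corresponding calls would be VirtualLock/VirtualUnlock), and the names secure_wipe and LockedPan are invented for this example. It is not what Crypto++ does internally.

```cpp
#include <sys/mman.h>  // mlock, munlock (POSIX)
#include <cstddef>
#include <cstring>

// Zeroes a buffer through a volatile pointer so the compiler cannot
// discard the stores as dead writes.
void secure_wipe(void* data, std::size_t size) {
    volatile unsigned char* p = static_cast<volatile unsigned char*>(data);
    for (std::size_t i = 0; i < size; ++i) {
        p[i] = 0;
    }
}

// Hypothetical holder for a card number: tries to lock its storage into RAM
// on construction and wipes it on destruction.
class LockedPan {
public:
    LockedPan() : locked_(mlock(pan_, sizeof(pan_)) == 0) {}
    ~LockedPan() {
        secure_wipe(pan_, sizeof(pan_));
        if (locked_) {
            munlock(pan_, sizeof(pan_));
        }
    }
    LockedPan(const LockedPan&) = delete;
    LockedPan& operator=(const LockedPan&) = delete;

    void set(const char* value) {
        std::strncpy(pan_, value, sizeof(pan_) - 1);
        pan_[sizeof(pan_) - 1] = '\0';
    }
    const char* get() const { return pan_; }
    bool locked() const { return locked_; }  // false if mlock failed, e.g. due to RLIMIT_MEMLOCK

private:
    char pan_[20] = {};
    bool locked_;
};
```

mlock and munlock work on whole pages, so unlocking can also affect other data sharing the same page. Any copies the program makes elsewhere (I/O buffers, temporaries) are not covered by this wrapper.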
The stack is "cleaned up" by moving the stack pointer, not by actually erasing the values on it; the only mechanics involved are popping the return address and saved values into the appropriate registers. You must do the wiping manually. Also, volatile can help you avoid optimizations on a per-variable basis. You could wipe the stack manually, but you would need assembler to do that, and manipulating the stack is not simple: it is not really your resource, and as far as you are concerned the compiler owns it.
We have a relatively large code base for a Linux server, its dynamically linked libraries, and server modules loaded during startup using dlopen(). The server and most of the other components are written in C++11, but some are in C99.
What approaches could one use to test whether the server, its dependencies and modules properly handle memory allocation failures, e.g. malloc/calloc returning NULL, operators new and new[] throwing std::bad_alloc, and allocation failures from std::string::resize() and the like?
In the past, I've tried using memory allocation hooks to inject memory allocation failures into C applications, but I think these don't work for C++. What other options or approaches should I be looking at?
In fact, hooking into C malloc is enough, since under the hood the gcc C++ default implementation of operator new does call malloc and you confirmed you only needed a gcc compatible solution.
I can demonstrate it with this simple program:
mem.c++:
#include <iostream>
#include <string>
class A {
int ival;
std::string str;
public:
A(int i, std::string s): ival(i), str(s) {}
A(): ival(0), str("") {};
int getIval() const {
return ival;
}
std::string getStr() const {
return str;
}
};
int main() {
A a(2, "foo");
std::cout << &a << " : " << a.getIval() << " - " << a.getStr() << std::endl;
return 0;
}
memhook.c:
#include <stdio.h>
#include <stdlib.h>
extern void *__libc_malloc(size_t size);
void* malloc (size_t size) {
fprintf(stderr, "Allocating %zu\n", size);  /* %zu is the format for size_t */
return NULL;
// return __libc_malloc(size);
}
When returning NULL (as above), the program displays:
Allocating 16
Allocating 100
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted (core dumped)
That proves that returning NULL from the declared malloc function results in a std::bad_alloc exception in C++ code.
When uncommenting the return __libc_malloc(size); the allocations are done by the libc malloc and the output becomes:
Allocating 16
0xbfe8d2e8 : 2 - foo
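A different technique from the malloc hook above, and one that does not rely on operator new calling malloc, is to replace the global operator new itself, which the C++ standard explicitly permits. The sketch below is an illustration only. The countdown variable g_allocs_until_failure and the helper allocation_throws_when_armed are hypothetical names, and the counter handling is only meant for single-threaded tests.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical knob: -1 disables injection; N >= 0 lets N more allocations
// succeed and then fails every allocation until it is reset to -1.
static std::atomic<long> g_allocs_until_failure{-1};

static bool should_fail() {
    long remaining = g_allocs_until_failure.load();
    if (remaining < 0) return false;   // injection disabled
    if (remaining == 0) return true;   // budget used up: fail
    g_allocs_until_failure.store(remaining - 1);
    return false;
}

// Replacement global allocation functions. The default operator new[] and
// operator delete[] forward to these.
void* operator new(std::size_t size) {
    if (should_fail()) throw std::bad_alloc();
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc();
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Arms the injector, attempts one allocation, and reports whether it threw.
bool allocation_throws_when_armed() {
    g_allocs_until_failure = 0;
    bool threw = false;
    try {
        void* p = ::operator new(64);
        ::operator delete(p);
    } catch (const std::bad_alloc&) {
        threw = true;
    }
    g_allocs_until_failure = -1;  // disarm again
    return threw;
}
```

A test driver could run the same scenario repeatedly with the countdown set to 0, 1, 2, ... so that each run fails at a different allocation site. C modules that call malloc directly are not affected by this and would still need the malloc hook.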
On Linux you can hook into the operating system to force allocations to fail:
man 2 mlockall
mlockall(MCL_CURRENT|MCL_FUTURE);
Should do what you want.
I've written a simple program returning the hostname of the IP address passed as an argument.
The program uses two functions: getaddrinfo() and getnameinfo().
I'm using Linux Mint, the NetBeans IDE and the G++ compiler. The output is fine and there are no errors, but when I declare a
std::string str;
then cout gives no output, nothing is printed on the screen. However when I comment out the std::string declaration or remove it, the statement
std::cout << "hostname: " << hostname << std::endl;
prints the returned hostnames successfully.
What may be the cause of such a strange error?
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <iostream>
#include <string>
int main()
{
struct addrinfo* result;
struct addrinfo* res;
int error;
const char* host;
// When I comment out this line, cout prints the hostnames successfully.
std::string str;
error = getaddrinfo("127.0.0.1", NULL, NULL, &result);
res = result;
while (res != NULL)
{
char* hostname;
error = getnameinfo(res->ai_addr, res->ai_addrlen, hostname, 1025, NULL, 0, 0);
std::cout << "hostname: " << hostname << std::endl;
res = res->ai_next;
}
freeaddrinfo(result);
// When I declare a std::string str variable, this cout doesn't print anything either
std::cout << "TEST" << std::endl;
return 0;
}
The arguments host and serv are pointers to caller-allocated buffers (of size hostlen and servlen respectively) into which getnameinfo() places null-terminated strings containing the host and service names respectively.
http://man7.org/linux/man-pages/man3/getnameinfo.3.html
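Your pointer must point to a buffer that is actually allocated. As written, hostname is an uninitialized pointer, so getnameinfo() writes to an arbitrary address, which is undefined behavior. The fact that commenting out that line changes anything is probably a coincidence or a strange side effect of optimization.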
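A minimal sketch of the fix, assuming Linux with glibc (where netdb.h defines NI_MAXHOST as 1025, the size used in the question). The helper name lookup_hostname and its flags parameter are made up for this example.

```cpp
#include <netdb.h>       // getaddrinfo, getnameinfo, freeaddrinfo, NI_MAXHOST
#include <sys/socket.h>
#include <string>

// Hypothetical helper: returns the name of the first address that `ip`
// resolves to, or an empty string on failure.
std::string lookup_hostname(const char* ip, int flags = 0) {
    struct addrinfo* result = nullptr;
    if (getaddrinfo(ip, nullptr, nullptr, &result) != 0) {
        return "";
    }

    char hostname[NI_MAXHOST];  // caller-allocated buffer, as the man page requires
    std::string name;
    if (getnameinfo(result->ai_addr, result->ai_addrlen,
                    hostname, sizeof(hostname),
                    nullptr, 0, flags) == 0) {
        name = hostname;
    }

    freeaddrinfo(result);
    return name;
}
```

The only essential change compared with the question is that hostname is now a real array and its size is passed as sizeof(hostname). The return values of getaddrinfo() and getnameinfo() are also checked before the buffer is read.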
Thanks, it works now :). I'd like to find out when to use different ways to allocate memory.
As far as I know, this is the main difference between creating an object in the following two ways:
// Creating objects:
Test t1;
Test* t2 = new Test();
The first object will be created on the stack and it will be destroyed automatically when the function finishes running.
The second object will be created on the heap and its memory has to be released manually using the delete operator (delete[] for arrays)?
So what else should I keep in mind when dealing with pointers?
I guess I need to read a good book about computer architecture, as knowledge about memory and microprocessors will pay off :)
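To make the difference between the two ways of creating an object concrete, here is a small sketch. The Test struct with its instance counter and the function live_after_scopes are made up for the illustration. It also shows std::unique_ptr, which is one common way to avoid calling delete by hand.

```cpp
#include <memory>

// Illustrative type that counts how many instances are currently alive.
struct Test {
    static int live;
    Test() { ++live; }
    ~Test() { --live; }
};
int Test::live = 0;

// Creates objects in three ways and returns how many are still alive afterwards.
int live_after_scopes() {
    {
        Test t1;  // automatic storage ("stack"): destroyed at the closing brace
    }

    Test* t2 = new Test();  // dynamic storage ("heap"): lives until delete
    delete t2;              // without this line the object would leak

    {
        std::unique_ptr<Test> t3(new Test());  // heap object owned by a smart pointer
    }                                          // unique_ptr calls delete here

    return Test::live;
}
```

Every object has been destroyed by the end of the function, so the counter is back to zero. Removing the delete line would leave it at one.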
For the sake of demonstration, I have created this simple console application:
#include <iostream>
class Person {
public:
int mAge;
};
int main(int argc, const char * argv[])
{
Person *iPerson = new Person();
iPerson->mAge = 15;
std::cout << "Age: " << iPerson->mAge;
return 0;
}
Now I'm aware that Valgrind and Cppcheck will identify leaks here, but I am testing Apple's Instruments: when I profile this code, I can't see any leaks, despite iPerson never being deleted.
I've worked it out:
I had to set the snapshot interval to 1 second.
I had to disable (set to None) Optimization for the release version (for which profiling is done).
Then based on justin's reply and this question, I had to modify my code like so:
#include <iostream>
#include <unistd.h>
class Person {
public:
int mAge;
};
void CreateLeaks()
{
// All three lines will generate a leak.
Person *iPerson = new Person();
iPerson = new Person();
iPerson = new Person();
}
int main(int argc, const char * argv[])
{
CreateLeaks();
sleep( 2 );
return 0;
}
There are still some odd things going on. For example, if you start adding sleep(2) within CreateLeaks, Instruments doesn't catch all leaks (depending on where you put the sleep calls). Odd.
The Leaks instrument takes snapshots at a predefined interval. By default, that interval is 10 seconds. Your program completes in less than 10 seconds, so the leak is never collected. You must therefore suspend execution after iPerson has gone out of scope in order for that leak to be detected. Also, if you just add a sleep while that pointer is still referenced on the stack or in a register, then it won't be reported as a leak.
You could have a look at Tips for Improving Leak Detection from the Mac Developer Library.
Cppcheck, a static analysis tool for C/C++ code, might also help. For the example you provided, it finds:
#>cppcheck so_code.cpp
Checking so_code.cpp...
[so_code.cpp:15]: (error) Memory leak: iPerson