C++ memory leak that I can't find - c++

I have a small example program here for the Particle Photon that has a memory bug that I cannot figure out.
What it does: loads up a buffer with small string chunks, converts that large buffer back into a string. Then it creates a bunch of objects that are only wrappers for small chunks of buffer. It does this repetitively, and I don't allocate any new memory after the setup(), yet the memory goes down slowly until it crashes.
main.cpp
includes, variable declarations
#include "application.h" //needed when compiling spark locally
#include <string>
#include <unordered_map>
#include "dummyclass.h"
using namespace std;
SYSTEM_MODE(MANUAL);
char* buffer;
unordered_map<int, DummyClass*> store;
string alphabet;
unsigned char alphabet_range;
unsigned char state;
int num_chars;
static const unsigned char STATE_INIT = 0;
static const unsigned char STATE_LOAD_BUFFER = 1;
static const unsigned char STATE_PREP_FOR_DESERIALIZE = 2;
static const unsigned char STATE_FAKE_DESERIALIZE = 3;
static const unsigned char STATE_FINISH_RESTART = 4;
delete objects helper function
bool delete_objects()
{
Serial.println("deleting objects in 'store'");
for(auto iter = store.begin(); iter != store.end(); iter++)
{
delete iter->second;
iter->second = nullptr;
}
store.clear();
if(store.empty())
return true;
else
return false;
}
set up function, allocates memory, initial assignments
void setup()
{
Serial.begin(9600);
Serial1.begin(38400);
delay(2000);
buffer = new char[9000];
alphabet = string("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789~!##$^&*()_-?/><[]{}|");
alphabet_range = alphabet.length() - 1;
state = STATE_INIT;
num_chars = 0;
}
loop function, gets run over and over
void loop()
{
switch(state){
case STATE_INIT: {
strcpy(buffer, "");
state = STATE_LOAD_BUFFER;
delay(1000);
break;
}
case STATE_LOAD_BUFFER: {
if(num_chars < 6000){
string chunk;
for(char i = 0; i < 200; i++){
int index = rand() % alphabet_range;
chunk.append(alphabet.substr(index, 1));
num_chars++;
}
strcat(buffer, chunk.c_str());
}
else{
num_chars = 0;
state = STATE_PREP_FOR_DESERIALIZE;
}
delay(500);
break;
}
case STATE_PREP_FOR_DESERIALIZE: {
Serial.println("\nAttempting to delete current object set...");
delay(500);
if(delete_objects())
Serial.println("_delete_objects succeeded");
else {
Serial.println("_delete_objects failed");
break;
}
state = STATE_FAKE_DESERIALIZE;
delay(1000);
break;
}
case STATE_FAKE_DESERIALIZE: {
string buff_string(buffer);
if(buff_string.length() == 0){
Serial.println("Main:: EMPTY STRING CONVERTED FROM BUFFER");
}
int index = 0;
int key = 1;
while(index < buff_string.length())
{
int amount = (rand() % 50) + 5;
DummyClass* dcp = new DummyClass(buff_string.substr(index, amount));
store[key] = dcp;
index += amount;
key++;
}
state = STATE_FINISH_RESTART;
delay(1000);
break;
}
case STATE_FINISH_RESTART: {
state = STATE_INIT;
break;
}
}
}
dummyclass.h
very minimal, constructor just stores a string in a character buffer. this object is just a wrapper.
using namespace std;
class DummyClass {
private:
char* _container;
public:
DummyClass(){
}
DummyClass(string input){
_container = new char[input.length()];
strcpy(_container, input.c_str());
}
~DummyClass(){
delete _container;
_container = nullptr;
}
char* ShowMeWhatYouGot(){
return _container;
}
};
EDIT:
This is a real problem that I am having, I'm not sure why it is getting downvoted. Help me out here, how can I be more clear? I'm reluctant to shrink the code since it imitates many aspects of a much bigger program that it is modeling simply. I want to keep the structure of the code in place in case this bug is an emergent property.

Always account for the string terminator:
DummyClass(string input){
_container = new char[input.length()];
strcpy(_container, input.c_str());
}
This allocates one byte too few to hold the input string plus its terminator, which is then copied into it. The \0 that's appended at the end overwrites something, most likely the metadata required to re-integrate the allocated memory fragment back into the heap successfully. I'm actually surprised it didn't crash...
It probably doesn't happen every allocation (only when you overflow into a new 8 byte aligned chunk), but once is enough :)
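A minimal sketch of the corrected wrapper, assuming the rest of the program stays as posted. Besides the extra byte for the terminator, it uses delete[] to match new[] and disables copying; those two changes are my additions, not something the question asked for.

```cpp
#include <cstring>
#include <string>

class DummyClass {
private:
    char* _container;
public:
    DummyClass() : _container(nullptr) {}

    // length() does not count the terminating '\0', so reserve one extra byte.
    explicit DummyClass(const std::string& input)
        : _container(new char[input.length() + 1]) {
        std::strcpy(_container, input.c_str());
    }

    // Memory obtained with new[] must be released with delete[].
    ~DummyClass() { delete[] _container; }

    // Copying would make two objects free the same buffer, so forbid it here.
    DummyClass(const DummyClass&) = delete;
    DummyClass& operator=(const DummyClass&) = delete;

    char* ShowMeWhatYouGot() { return _container; }
};
```

The default constructor now sets the pointer to nullptr, so destroying a default-constructed object is safe as well.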

So, after some testing, I'd like to give a shout out to Russ Schultz who commented the right answer. If you want to post a solution formally, I would be happy to mark it as correct.
The memory bug is caused by allocating the char buffer _container without considering the null terminating character, meaning I am loading in a string that is too big. (not entirely sure why this causes a bug and doesn't throw an error?)
On a different site however, I also received this piece of advice:
string chunk;
for(char i = 0; i < 200; i++){
int index = rand() % alphabet_range;
chunk.append(alphabet.substr(index, 1));
// strcat(buffer, alphabet.substring(index, index + 1));
num_chars++;
}
This loop looks suspect to me. You are depending on the string append method to grow chunk as needed, but you know you are going to run that loop 200 times. Why not use the string reserve method to just allocate that much space? I bet that this chews up a lot of memory with each new char you append calling realloc, potentially fragmenting memory.
This ended up not being the solution, but it might be good to know.
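For reference, the reserve suggestion could look roughly like the sketch below. makeChunk is a hypothetical helper name, not part of the original program. It also counts with std::size_t, because a char counter running to 200 only works where char is unsigned, and it picks from the whole alphabet, whereas the original modulus of length() - 1 never selects the last character.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Builds a chunk of `count` random characters taken from `alphabet`.
// Assumes `alphabet` is not empty.
std::string makeChunk(const std::string& alphabet, std::size_t count) {
    std::string chunk;
    chunk.reserve(count);  // one allocation up front instead of repeated growth
    for (std::size_t i = 0; i < count; ++i) {
        chunk.push_back(alphabet[std::rand() % alphabet.size()]);
    }
    return chunk;
}
```

In the loop() from the question this would replace the inner for loop, e.g. strcat(buffer, makeChunk(alphabet, 200).c_str()).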

Related

My code is right but not accepted by Leetcode platform. (ZigZag Conversion)

It is a leet code problem under the subcategory of string, medium problem.
Query: My program returns the right result for all the test cases when I run it, but when I submit, the same test cases do not pass.
My Code is:
string convert(string s, int numRows) {
int loc_rows = numRows-2;
int i=0;
int a=0,b=0;
int arr[1000][1000];
while(i<s.length())
{
if(a<numRows)
{
arr[a][b] = s[i];
a++;
i++;
}
else if(a>=numRows)
{
if(loc_rows>=1)
{
b++;
arr[loc_rows][b]=s[i];
i++;
loc_rows--;
}
else{
loc_rows=numRows-2;
b++;
a=0;
}
}
}
string result="";
for(int d=0;d<numRows;d++)
{
for(int y=0;y<b+1;y++)
{
char temp = (char)arr[d][y];
if((temp>='a' and temp<='z') or (temp>='A' and temp<='Z') )
result+=temp;
}
}
return result;
}
I believe the issue might be your un-initialised arrays / variables.
Try initialising your array: int arr[1000][1000] = {0};
live example failing: https://godbolt.org/z/dxf13P
live example passing: https://godbolt.org/z/8vYEv6
You can't rely on the data that is in these arrays so initialising the values is quite important.
Note: this is because you rely on the empty cells in the array not being letters ([a-zA-Z]), so that your final loop, which keeps only the letters, can reconstruct the output. This works the first time around because, luckily, arr contains 0's in the gaps between your values (or at least not letters). The second time around it contains junk from the first call (you really don't know what this will be, but in practice it is probably just the values you left there last time). So even though you write the correct values into arr each time, your final loop also picks up some of the stale letters left in the array, hence your program is incorrect.
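A rough sketch of the same algorithm with storage that is guaranteed to start out empty on every call. It keeps the asker's control flow but swaps the raw array for a std::vector filled with '\0' and tests for "not empty" rather than "is a letter" (my change, so that characters such as '.' survive). It assumes numRows >= 1.

```cpp
#include <cstddef>
#include <string>
#include <vector>

std::string convert(const std::string& s, int numRows) {
    // Every cell starts as '\0'. Each column used holds at least one character,
    // so s.size() + 1 columns is always enough (the +1 covers an empty input).
    std::vector<std::vector<char>> arr(numRows, std::vector<char>(s.size() + 1, '\0'));
    int locRows = numRows - 2;  // next row to fill on the diagonal
    int a = 0;                  // current row while walking down a column
    std::size_t b = 0;          // current column
    std::size_t i = 0;          // index into s
    while (i < s.size()) {
        if (a < numRows) {              // walking down a full column
            arr[a][b] = s[i];
            ++a;
            ++i;
        } else if (locRows >= 1) {      // walking up the diagonal
            ++b;
            arr[locRows][b] = s[i];
            ++i;
            --locRows;
        } else {                        // diagonal finished, start the next column
            locRows = numRows - 2;
            ++b;
            a = 0;
        }
    }
    std::string result;
    for (int d = 0; d < numRows; ++d) {
        for (std::size_t y = 0; y <= b; ++y) {
            if (arr[d][y] != '\0') {    // skip the cells that were never written
                result += arr[d][y];
            }
        }
    }
    return result;
}
```

Because the vector is rebuilt on each call, repeated calls cannot see data from earlier inputs.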
Alternatively, we could also use unsigned int to make it just a bit more efficient:
// The following block might slightly improve the execution time;
// Can be removed;
static const auto __optimize__ = []() {
std::ios::sync_with_stdio(false);
std::cin.tie(NULL);
std::cout.tie(NULL);
return 0;
}();
// Most of headers are already included;
// Can be removed;
#include <cstdint>
#include <vector>
#include <string>
static const struct Solution {
using ValueType = std::uint_fast16_t;
static const std::string convert(
const std::string s,
const int num_rows
) {
if (num_rows == 1) {
return s;
}
std::vector<std::string> res(num_rows);
ValueType row = 0;
ValueType direction = -1;
for (ValueType index = 0; index < std::size(s); ++index) {
if (!(index % (num_rows - 1))) {
direction *= -1;
}
res[row].push_back(s[index]);
row += direction;
}
std::string converted;
for (const auto& str : res) {
converted += str;
}
return converted;
}
};

Modifying integer value from another process using process_vm_readv

I am using Ubuntu Linux to write two programs. I am attempting to change the value of an integer from another process. My first process (A) is a simple program that loops forever and displays the value to the screen. This program works as intended and simply displays the value -1430532899 (0xAABBCCDD) to the screen.
#include <stdio.h>
int main()
{
//The needle that I am looking for to change from another process
int x = 0xAABBCCDD;
//Loop forever printing out the value of x
int counter = 0;
while(1==1)
{
while(counter<100000000)
{
counter++;
}
counter = 0;
printf("%d",x);
fflush(stdout);
}
return 0;
}
In a separate terminal, I use the ps -e command to list the processes and note the process ID of process (A). Next, as root (using sudo), I run this next program (B) and enter the process ID that I noted for process (A).
The program basically searches memory for the needle, which appears backwards there (DD CC BB AA), and takes note of the address where it finds it. It then tries to write the hex value 0xEEEEEEEE to that same location, but the write fails with errno set to 14 (bad address). The strange thing is that a little later in the address space I am able to write the value successfully to the address 0x601000, but I cannot write to 0x6005DF, where the needle (0xAABBCCDD) is. (I can obviously read there, because that is where I found the needle.)
#include <stdio.h>
#include <iostream>
#include <sys/uio.h>
#include <string>
#include <errno.h>
#include <vector>
using namespace std;
char getHex(char value);
string printHex(unsigned char* buffer, int length);
int getProcessId();
int main()
{
//Get the process ID of the process we want to read and write
int pid = getProcessId();
//Lists of addresses where we find our needle 0xAABBCCDD and the addresses where we simply cannot read
vector<long> needleAddresses;
vector<long> unableToReadAddresses;
unsigned char buf1[1000]; //buffer used to store memory values read from other process
//Number of bytes read, also is -1 if an error has occurred
ssize_t nread;
//Structures used in the process_vm_readv system call
struct iovec local[1];
struct iovec remote[1];
local[0].iov_base = buf1;
local[0].iov_len = 1000;
remote[0].iov_base = (void * ) 0x00000; //start at address 0 and work up
remote[0].iov_len = 1000;
for(int i=0;i<10000;i++)
{
nread = process_vm_readv(pid, local, 1, remote, 1 ,0);
if(nread == -1)
{
//errno is 14 then the problem is "bad address"
if(errno == 14)
unableToReadAddresses.push_back((long)remote[0].iov_base);
}
else
{
cout<<printHex(buf1,local[0].iov_len);
for(int j=0;j<1000-3;j++)
{
if(buf1[j] == 0xDD && buf1[j+1] == 0xCC && buf1[j+2] == 0xBB && buf1[j+3] == 0xAA)
{
needleAddresses.push_back((long)(remote[0].iov_base+j));
}
}
}
remote[0].iov_base += 1000;
}
cout<<"Addresses found at...";
for(int i=0;i<needleAddresses.size();i++)
{
cout<<needleAddresses[i]<<endl;
}
//How many bytes written
int nwrite = 0;
struct iovec local2[1];
struct iovec remote2[1];
unsigned char data[] = {0xEE,0xEE,0xEE,0xEE};
local2[0].iov_base = data;
local2[0].iov_len = 4;
remote2[0].iov_base = (void*)0x601000;
remote2[0].iov_len = 4;
for(int i=0;i<needleAddresses.size();i++)
{
cout<<"Attempting to write "<<printHex(data,4)<<" to address "<<needleAddresses[i]<<endl;
remote2[0].iov_base = (void*)needleAddresses[i];
nwrite = process_vm_writev(pid,local2,1,remote2,1,0);
if(nwrite == -1)
{
cout<<"Error writing to "<<needleAddresses[i]<<endl;
}
else
{
cout<<"Successfully wrote data";
}
}
//For some reason THIS will work
remote2[0].iov_base = (void*)0x601000;
nwrite = process_vm_writev(pid,local2,1,remote2,1,0);
cout<<"Wrote "<<nwrite<<" Bytes to the address "<<0x601000 <<" "<<errno;
return 0;
}
string printHex(unsigned char* buffer, int length)
{
string retval;
char temp;
for(int i=0;i<length;i++)
{
temp = buffer[i];
temp = temp>>4;
temp = temp & 0x0F;
retval += getHex(temp);
temp = buffer[i];
temp = temp & 0x0F;
retval += getHex(temp);
retval += ' ';
}
return retval;
}
char getHex(char value)
{
if(value < 10)
{
return value+'0';
}
else
{
value = value - 10;
return value+'A';
}
}
int getProcessId()
{
int data = 0;
printf("Please enter the process id...");
scanf("%d",&data);
return data;
}
Bottom line is that I cannot modify the repeating printed integer from another process.
I can see at least these problems.
No one guarantees there's 0xAABBCCDD anywhere in the writable memory of the process. The compiler can optimize it away entirely, or put it in a register. One way to ensure a variable will be placed in main memory is to declare it volatile.
volatile int x = 0xAABBCCDD;
No one guarantees there's no 0xAABBCCDD somewhere in the read-only memory of the process. On the contrary, one could be quite certain there is in fact such a value there. Where else could the program possibly obtain it to initialise the variable? The initialisation probably translates to an assembly instruction similar to this
mov eax, 0xAABBCCDD
which, unsurprisingly, contains a bit pattern that matches 0xAABBCCDD. The address 0x6005DF could well be in the .text section. It is extremely unlikely it is on the stack, because stack addresses are typically close to the top of the address space.
The address space of a 64-bit process is huge. There is no hope to traverse it all in a reasonable amount of time. One needs to limit the range of addresses somehow.
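One way to limit the range on Linux is to read /proc/<pid>/maps, which lists each mapped region with its permissions, and to scan only the regions marked readable and writable. The sketch below is a minimal parser for that format; Region and writableRegions are hypothetical names of mine, and error handling is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Region {
    std::uintptr_t start;  // first address of the mapping
    std::uintptr_t end;    // one past the last address
};

// Parses text in the /proc/<pid>/maps format, e.g.
//   00601000-00602000 rw-p 00001000 08:01 123 /tmp/a.out
// and returns only the mappings that are both readable and writable.
std::vector<Region> writableRegions(std::istream& maps) {
    std::vector<Region> out;
    std::string line;
    while (std::getline(maps, line)) {
        std::istringstream fields(line);
        std::string range, perms;
        if (!(fields >> range >> perms)) continue;         // skip malformed lines
        std::size_t dash = range.find('-');
        if (dash == std::string::npos || perms.size() < 2) continue;
        if (perms[0] != 'r' || perms[1] != 'w') continue;  // need read and write
        Region r;
        r.start = static_cast<std::uintptr_t>(std::stoull(range.substr(0, dash), nullptr, 16));
        r.end = static_cast<std::uintptr_t>(std::stoull(range.substr(dash + 1), nullptr, 16));
        out.push_back(r);
    }
    return out;
}
```

To use it against a real process, open "/proc/" + std::to_string(pid) + "/maps" with a std::ifstream, pass the stream in, and point process_vm_readv at each returned range. An address such as 0x6005DF that lies in a mapping without the w flag would then never be reported as a candidate.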

Dealing with accessing NULL pointer

In my class, I've got - inter alia - a pointer:
Class GSM
{
//...
private:
char *Pin;
//...
}
My constructor initializes it as:
GSM::GSM()
{
//...
Pin = NULL;
//...
}
Now, I want to set a default value ("1234") for my PIN. I tried a very simple way:
bool GSM::setDefaultValue()
{
lock();
Pin = "0";
for (uint8 i =0; i < 4; ++i)
{
Pin[i] = i+1;
}
unlock();
return true;
}
But it didn't work. When I run my program (I use Visual Studio 2010) there is an error:
Access violation writing location 0x005011d8
I tried to remove line
Pin = "0";
But it didn't help. I have to initialize it as NULL in the constructor. It's part of a larger project, but I think the code above is what is causing me trouble. I'm still pretty new to C++/OOP and sometimes I still get confused by pointers.
What should I do to improve my code and the way I think?
EDIT: As requested, I have to add that I can't use std::string. I'm a trainee at a company, the project is pretty big (thousands of files), I have not seen any std in it, and I'm not allowed to use it.
You need to give the Pin some memory. Something like this:
Pin = new char[5]; // To make space for terminating `\0`;
for(...)
{
Pin[i] = '0' + i + 1;
}
Pin[4] = '\0'; // End of the string so we can use it as a string.
...
You should then use delete [] Pin; somewhere too (Typically in the destructor of the class, but depending on how it's used, it may be needed elsewhere, such as assignment operator, and you need to also write a copy-constructor, see Rule Of Three).
In proper C++, you should use std::string instead, and you could then do:
class GSM
{
//...
private:
std::string Pin;
....
Pin = "0000";
for (uint8 i =0; i < 4; ++i)
{
Pin[i] += i+1;
}
Using std::string avoids most of the problems of allocating/deallocating memory, and "just works" when you copy, assign or destroy the class, because the std::string implementation and the compiler do the work for you.
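Since std::string turned out not to be an option, here is a rough sketch of what the Rule of Three mentioned in this answer means for a raw char* member. PinHolder is a hypothetical stand-in for GSM, the locking is left out, and only C library functions from <string.h> are used.

```cpp
#include <string.h>

class PinHolder {  // hypothetical stand-in for the GSM class
public:
    PinHolder() : Pin(NULL) {}

    // Copy constructor: give the new object its own copy of the buffer.
    PinHolder(const PinHolder& other) : Pin(clone(other.Pin)) {}

    // Copy assignment: copy first, then release the old buffer.
    PinHolder& operator=(const PinHolder& other) {
        if (this != &other) {
            char* copy = clone(other.Pin);
            delete[] Pin;
            Pin = copy;
        }
        return *this;
    }

    // Destructor: delete[] on a NULL pointer is a harmless no-op.
    ~PinHolder() { delete[] Pin; }

    void setDefaultValue() {
        char* fresh = new char[5];  // 4 digits plus the terminating '\0'
        for (int i = 0; i < 4; ++i) {
            fresh[i] = static_cast<char>('0' + i + 1);
        }
        fresh[4] = '\0';
        delete[] Pin;               // avoid leaking a previous value
        Pin = fresh;
    }

    const char* pin() const { return Pin; }

private:
    // Returns a freshly allocated copy of src, or NULL if src is NULL.
    static char* clone(const char* src) {
        if (src == NULL) return NULL;
        char* copy = new char[strlen(src) + 1];
        strcpy(copy, src);
        return copy;
    }

    char* Pin;
};
```

Without the copy constructor and assignment operator, two objects would end up sharing one buffer and both destructors would try to free it.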
You need to allocate a block of memory to store "1234". This memory block will be pointed by your Pin pointer.
You can try:
bool GSM::setDefaultValue()
{
lock();
Pin = new char[5]; // 4 digits plus the terminating '\0'
for (uint8 i =0; i < 4; ++i)
{
Pin[i] = '0' + (i + 1);
}
Pin[4] = '\0'; // terminate so Pin can be used as a C string
unlock();
return true;
}
As you have dynamically allocated a memory block, you should always release it when you no longer need it. To do so, add a destructor to your class:
GSM::~GSM()
{
delete [] Pin;
}
Simple answer:
Instead of using the heap (new delete) just allocate space in your class for the four character pin:
class GSM
{
//...
private:
char Pin[5];
//...
};
The length is fixed at 5 (to allow space for 4 characters and the terminating null, '\0'), but as long as you only need to store a maximum of 4 characters, you are fine.
Of course, if you want to make it easy to change in the future:
class GSM
{
//...
private:
static const int pin_length = 4; // must be static to be usable as an array bound
char Pin[pin_length+1];
//...
};
Your function to set the value will then look like:
bool GSM::setDefaultValue()
{
lock();
for (char i = 0; i < pin_length; ++i)
{
Pin[i] = '0' + i + 1; // store the characters '1'..'4', not the byte values 1..4
}
Pin[pin_length]=0;
unlock();
return true;
}
or even:
bool GSM::setDefaultValue()
{
lock();
strcpy(Pin,"1234"); //though you would have to change this if the pin-length changes.
unlock();
return true;
}

Return string read from buffer and function without dynamic allocation?

How would I go about returning a string built from a buffer within a function without dynamically allocating memory?
Currently I have this function to consider:
// Reads null-terminated string from buffer in instance of buffer class.
// uint16 :: unsigned short
// ubyte :: unsigned char
ubyte* Readstr( void ) {
ubyte* Result = new ubyte[]();
for( uint16 i = 0; i < ByteSize; i ++ ) {
Result[ i ] = Buffer[ ByteIndex ];
ByteIndex ++;
if ( Buffer[ ByteIndex - 1 ] == ubyte( 0 ) ) {
ByteIndex ++;
break;
};
};
return Result;
};
While I can return the built string, I can't do this without dynamic allocation. This becomes a problem if you consider the following usage:
// Instance of buffer class "Buffer" calling Readstr():
cout << Buffer.Readstr() << endl;
// or...
ubyte String[] = Buffer.String();
Usages similar to this call result in the same memory leak as the data is not being deleted via delete. I don't think there is a way around this, but I am not entirely sure if it's possible.
Personally, I'd recommend just returning std::string or std::vector<T>: this neatly avoids memory leaks, and the string won't allocate memory for small strings (well, most implementations are going that way but not all are quite there).
The alternative is to create a class which can hold a big enough array and return an object of that type:
struct buffer {
enum { maxsize = 16 };
ubyte buffer[maxsize];
};
If you want to get fancier and support bigger strings, which would then allocate memory, you'll need to deal a bit more with constructors, destructors, etc. (or just use std::vector<ubyte> and get over it).
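A small sketch of the fixed-size approach, under the assumption that strings longer than the array may simply be truncated. FixedString and readFixed are hypothetical names, and the free function stands in for the member Readstr, taking the buffer, its size and the read index as parameters.

```cpp
#include <cstddef>

typedef unsigned char ubyte;

// Plain struct that owns its storage, so it can be returned by value.
struct FixedString {
    enum { maxsize = 16 };
    ubyte data[maxsize];
};

// Copies the next null-terminated string from `source` into a FixedString,
// advancing `index` past the bytes it consumed. At most maxsize - 1
// characters are copied; the result is always null-terminated.
FixedString readFixed(const ubyte* source, std::size_t sourceSize, std::size_t& index) {
    FixedString out = {};  // zero-filled, so the terminator is already in place
    const std::size_t limit = static_cast<std::size_t>(FixedString::maxsize) - 1;
    std::size_t n = 0;
    while (index < sourceSize && n < limit) {
        ubyte c = source[index++];
        if (c == 0) {
            break;         // end of this string, terminator consumed
        }
        out.data[n++] = c;
    }
    return out;
}
```

Nothing is allocated with new, so there is nothing for the caller to delete. The cost is that the whole array is copied on return, and a string that does not fit is cut off with its remainder left unread in the source.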
There are at least three ways you could reimplement the method to avoid a direct allocation with new.
The Good:
Use a std::vector (This will allocate heap memory):
std::vector<ubyte> Readstr()
{
std::vector<ubyte> Result;
for (uint16 i = 0; i < ByteSize; i++)
{
Result.push_back(Buffer[ByteIndex]);
ByteIndex++;
if (Buffer[ByteIndex - 1] == ubyte(0))
{
ByteIndex++;
break;
}
}
return Result;
}
The Bad:
Force the caller to provide an output buffer and possibly a size to avoid overflows (does not directly allocate memory):
ubyte* Readstr(ubyte* outputBuffer, size_t maxCount)
{
for (uint16 i = 0; i < ByteSize; i++)
{
if (i == maxCount)
break;
outputBuffer[i] = Buffer[ByteIndex];
ByteIndex++;
if (Buffer[ByteIndex - 1] == ubyte(0))
{
ByteIndex++;
break;
}
}
return outputBuffer;
}
The Ugly:
Use an internal static array and return a pointer to it:
ubyte* Readstr()
{
enum { MAX_SIZE = 2048 }; // Up to you to decide the max size...
static ubyte outputBuffer[MAX_SIZE];
for (uint16 i = 0; i < ByteSize; i++)
{
if (i == MAX_SIZE)
break;
outputBuffer[i] = Buffer[ByteIndex];
ByteIndex++;
if (Buffer[ByteIndex - 1] == ubyte(0))
{
ByteIndex++;
break;
}
}
return outputBuffer;
}
Be aware that this last option has several limitations, including the possibility of data races in a multithreaded application and the inability to call it inside a recursive function, among other subtle issues. Otherwise, it is probably the closest to what you are looking for, and it can be used safely if you take some precautions and make some assumptions about the calling code.

Creating the biggest array of chars which can be allocated

I have tried to check programmatically how big an array I can allocate but my code does not seem to check it. How to make it faster? In the end I would like to get an exception.
#include "stdafx.h"
#include "iostream"
using namespace std;
int ASCENDING = 1, DESCENDING = 2;
int tworzTablice(int rozmiar, char* t){
try{
t = new char[rozmiar];
delete []t;
}catch (std::bad_alloc& e){
tworzTablice(rozmiar - 1,t);
return -1;
}
return rozmiar;
}
int f(long p, long skok){
char* t;
try{
while(true){
t = new char[p];
delete []t;
p = p + skok;
}
}
catch (std::bad_alloc& ba){
p = tworzTablice(p-1, t);
cout<<"blad";
}
return p;
}
int main(){
cout<<f(0, 100000000)<<endl;;
cout<<"koniec"<<endl;
system("pause");
return 0;
}
As I noted, there is a way to query the OS in order to determine the maximal size of heap-allocated memory, but I can't for the heck of it remember its name.
You can, however, easily find out yourself. Use malloc/free rather than new/delete for this, so that a failed allocation is reported by a null pointer instead of an exception:
#include <cstdlib>
#include <cstdio>
size_t maxMem() {
static size_t size = 0;
if (!size) {
size_t m = 0;
for (void* p = 0; (p = malloc(1<<m)); m++)
free(p);
while (m) {
size_t const testSize = size + (1<<(--m));
if (void* const p = malloc(testSize)) {
size = testSize;
free(p);
}
}
}
return size;
}
int main() {
// forgive me for using printf, but I couldn't remember how to hex-format in std::cout
printf("%u (hex %X)\n",int(maxMem()),int(maxMem()));
}
On my 64 bit machine I get
2147483647 (hex 7FFFFFFF)
while on another 32-bit system I get
2140700660 (hex 7F987FF4)
You can then go ahead and new an array of that size if you really have to. Note however, that this is the largest consecutive chunk you can request. The total memory your process might allocate is larger and depends on the installed RAM and the reserved swap space.
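The search idea can also be written as a plain binary search over sizes, with the allocation attempt passed in as a predicate. This is only a sketch under the assumption that the predicate is monotone (every size up to some limit succeeds, everything above fails) and that size 0 counts as success; largestAllocatable and tryMalloc are hypothetical names. Using std::size_t throughout avoids capping the result at the range of int.

```cpp
#include <cstddef>
#include <cstdlib>

// Returns the largest n in [0, upper] for which canAllocate(n) is true,
// assuming canAllocate is monotone and that size 0 counts as a success.
template <typename Pred>
std::size_t largestAllocatable(std::size_t upper, Pred canAllocate) {
    std::size_t lo = 0;      // largest size known (or assumed) to succeed
    std::size_t hi = upper;  // largest size still worth trying
    while (lo < hi) {
        // Midpoint rounded up, so the loop always makes progress.
        std::size_t mid = hi - (hi - lo) / 2;
        if (canAllocate(mid)) {
            lo = mid;
        } else {
            hi = mid - 1;
        }
    }
    return lo;
}

// Hypothetical probe: try to allocate n bytes and release them again.
// The volatile pointer is there to discourage the compiler from removing
// the malloc/free pair.
inline bool tryMalloc(std::size_t n) {
    void* volatile p = std::malloc(n);
    if (p == NULL) {
        return false;
    }
    std::free(p);
    return true;
}
```

A call such as largestAllocatable(some_upper_bound, tryMalloc) needs only about log2(upper) attempts, compared with the one-byte-at-a-time recursion in the question. On systems that overcommit memory, a successful malloc does not mean the memory can actually be used, so treat the number as an upper estimate.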
Allocating all available memory is probably a bad idea, but if you really want to:
vector<char*> ptrs;
int avail;
try {
while (true)
ptrs.push_back(new char[1000]);
}
catch (bad_alloc& b)
{
avail = ptrs.size() * 1000;
for (int i = 0; i < ptrs.size(); i++)
delete[] ptrs[i];
}