C++ program running out of memory for large data

I am trying to solve an issue in a C++ program I wrote. I am basically running out of memory. The program is a cache simulator. There is a file which has memory addresses collected beforehand, like this:
Thread Address Type Size Instruction Pointer
0 0x7fff60000000 1 8 0x7f058c482af3
There can be 100-500 billion such entries. First, I read all those entries and store them in a vector. While reading, I also build up a set of these addresses (using a map) and store the sequence numbers of each address. The sequence number is simply the position of the address entry in the file (one address can be seen multiple times). For large inputs the program fails during this phase with a bad_alloc error at around the 30 millionth entry, so I guess I am running out of memory. Please advise on how I can circumvent the problem. Is there an alternative way to handle this kind of large data? Thank you very much! Sorry for the long post; I wanted to give some context and the actual code I am writing.
Below is the relevant code. ParseTraceFile() reads each line and calls StoreTokens(), which extracts the address and size and calls AddAddress(), which actually stores the address in a vector and a map. The class declaration is also given below. The first try block in AddAddress() is what throws the bad_alloc exception.
void AddressList::ParseTraceFile(const char* filename) {
  std::ifstream in_file;
  std::cerr << "Reading Address Trace File..." << std::endl;
  in_file.exceptions(std::ifstream::failbit | std::ifstream::badbit);
  char *contents = NULL;
  try {
    in_file.open(filename, std::ifstream::in | std::ifstream::binary);
    in_file.seekg(0, std::ifstream::end);
    std::streampos length(in_file.tellg());
    if (length < 0) {
      std::cerr << "Can not read input file length" << std::endl;
      throw ExitException(1);
    }
    contents = new char[length];
    in_file.seekg(0, std::ifstream::beg);
    in_file.read(contents, length);
    in_file.close();
    uint64_t linecount = 0, i = 0, lastline = 0, startline = 0;
    while (i < static_cast<uint64_t>(length)) {
      if ((contents[i] == '\n') or (contents[i] == EOF)) {
        contents[i] = '\0';
        lastline = startline;
        startline = i + 1;
        ++linecount;
        if (linecount > 1) {
          StoreTokens((contents + lastline), &linecount);
        }
      }
      ++i;
    }
  } catch (std::bad_alloc& e) {
    delete[] contents;
    std::cerr << "error allocating memory while parsing" << std::endl;
    throw;
  } catch (std::ifstream::failure& exc1) {
    if (!in_file.eof()) {
      delete[] contents;
      std::cerr << "error in reading address trace file" << exc1.what()
                << std::endl;
      throw ExitException(1);
    }
  }
  std::cerr << "Done" << std::endl;
}
//=========================================================
void AddressList::StoreTokens(char* line, uint64_t* const linecount) {
  uint64_t address, size;
  char* token = strtok(line, " \t");
  uint8_t tokencount = 0;
  while (NULL != token) {
    ++tokencount;
    switch (tokencount) {
      case 1:
        break;
      case 2:
        address = strtoul(token, NULL, 16);
        break;
      case 3:
        break;
      case 4:
        size = strtoul(token, NULL, 0);
        break;
      case 5:
        break;
      default:
        break;
    }
    token = strtok(NULL, " \t");
  }
  AddAddress(address, size);
}
//================================================================
void AddressList::AddAddress(const uint64_t& byteaddr, const uint64_t& size) {
  // allocate memory for the address vector
  try {
    if ((sequence_no_ % kReserveCount) == 0) address_list_.reserve(kReserveCount);
  } catch (std::bad_alloc& e) {
    std::cerr
        << "error allocating memory for address trace vector, address count"
        << sequence_no_ << std::endl;
    throw;
  }
  uint64_t offset = byteaddr & (CacheParam::Instance()->LineSize() - 1);
  //lineaddress = byteaddr >> CacheParam::Instance()->BitsForLine();
  // this try block is for allocating memory for the address set and the queue it holds
  try {
    // splitter
    uint64_t templinesize = 0;
    do {
      Address temp_addr(byteaddr + templinesize);
      address_list_.push_back(temp_addr);
      address_set_[temp_addr.LineAddress()].push(sequence_no_++);
      templinesize = templinesize + CacheParam::Instance()->LineSize();
    } while (size + offset > templinesize);
  } catch (std::bad_alloc& e) {
    address_list_.pop_back();
    std::cerr
        << "error allocating memory for address trace set, address count"
        << sequence_no_ << std::endl;
    throw;
  }
}
//======================================================
typedef std::queue<uint64_t> TimeStampQueue;
typedef std::map<uint64_t, TimeStampQueue> AddressSet;

class AddressList {
 public:
  AddressList(const char* tracefilename);
  bool Simulate(uint64_t* hit_count, uint64_t* miss_count);
  ~AddressList();

 private:
  void AddAddress(const uint64_t& byteaddr, const uint64_t& size);
  void ParseTraceFile(const char* filename);
  void StoreTokens(char* line, uint64_t* const linecount);
  std::vector<Address> address_list_;
  AddressSet address_set_;
  uint64_t sequence_no_;
  CacheMemory cache_;
  AddressList(const AddressList&);
  AddressList& operator=(const AddressList&);
};
The output is like this:
Reading Cache Configuration File...
Cache parameters read...
Reading Address Trace File...
error allocating memory for address trace set, address count 30000000
error allocating memory while parsing

As it seems your datasets will be much larger than your memory, you will have to build an on-disk index. It is probably easiest to import the whole thing into a database and let it build the indexes for you.
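If you go the database route, a minimal sketch using the SQLite C API could look like the following. The file name trace.db, the table layout, and the sample row are assumptions for illustration, not part of the original program:

#include <sqlite3.h>

int main() {
  sqlite3* db = NULL;
  if (sqlite3_open("trace.db", &db) != SQLITE_OK) return 1;
  char* err = NULL;
  // One row per trace entry; seq doubles as the sequence number.
  sqlite3_exec(db,
               "CREATE TABLE IF NOT EXISTS trace("
               "seq INTEGER PRIMARY KEY, addr INTEGER, size INTEGER);",
               NULL, NULL, &err);
  // Let the database build the on-disk index that the in-memory map provided.
  sqlite3_exec(db, "CREATE INDEX IF NOT EXISTS idx_addr ON trace(addr);",
               NULL, NULL, &err);
  // Rows would be inserted while parsing; 140734804000768 is 0x7fff60000000.
  sqlite3_exec(db, "INSERT INTO trace(addr, size) VALUES(140734804000768, 8);",
               NULL, NULL, &err);
  sqlite3_close(db);
  return 0;
}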

A map sorts its input while it is being populated, to optimize lookup times and to provide sorted output. It sounds like you aren't using the lookup feature, so the optimal strategy is to sort the list with another method. Merge sort is fantastic for sorting collections that don't fit into memory. Even if you are doing lookups, a binary search into a sorted file will be faster than a naive approach as long as each record has a fixed size.
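To illustrate the fixed-size-record idea: if the trace is converted into a file of fixed-width records sorted by address, a lookup needs only O(log n) seeks. The Record layout and file handling here are assumptions, not taken from the question:

#include <cstdint>
#include <fstream>

// Hypothetical fixed-width record; the real layout depends on the trace.
struct Record {
  uint64_t addr;
  uint64_t seq;
};

// Binary search over a binary file of Records sorted by addr.
bool FindRecord(std::ifstream& in, uint64_t addr, Record* out) {
  in.seekg(0, std::ifstream::end);
  uint64_t count = static_cast<uint64_t>(in.tellg()) / sizeof(Record);
  uint64_t lo = 0, hi = count;
  while (lo < hi) {
    uint64_t mid = lo + (hi - lo) / 2;
    Record r;
    in.seekg(mid * sizeof(Record), std::ifstream::beg);
    in.read(reinterpret_cast<char*>(&r), sizeof(r));
    if (r.addr == addr) { *out = r; return true; }
    if (r.addr < addr) lo = mid + 1; else hi = mid;
  }
  return false;
}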

Forgive me for stating what may be obvious, but the need to store and query large amounts of data efficiently is the exact reason databases were invented. They have already solved all these issues in a better way than you or I could come up with in a reasonable amount of time. No need to reinvent the wheel.

Related

C++ - Sorting vector of structs with std::sort results in read access violation

I have a problem with std::sort. In the following code I'm using std::sort to sort a vector of structs (Highscore). However, when I run this line, a "read access violation" exception is thrown in the xmemory file.
Here are the details:
Exception thrown: read access violation.
_Pnext was 0x217AE3EE9D8. occurred
This is the method where the error occurs.
void HighscoreManager::sortAndChangeRanks(bool deleteLast) {
  std::sort(_highscores.begin(), _highscores.end());
  if (deleteLast && _highscores.size() > MaxHighscores) {
    _highscores.pop_back();
  }
  for (int i = 0; i < _highscores.size(); i++) {
    _highscores.at(i).rank = i + 1;
  }
}
_highscores is defined as std::vector<Highscore> _highscores; and is filled with values from a file before the method call. This works just fine. When I'm debugging right before the call to std::sort, the vector is filled with the right values from the file.
This is the implementation of the Highscore-struct:
struct Highscore {
  int rank;
  std::string name;
  int points;
  Highscore() {}
  Highscore(int r, std::string n, int p) : rank(r), name(std::move(n)), points(p) {}
  bool operator<(const Highscore& h1) const {
    return points < h1.points;
  }
};
Please help me or point me in a direction where the error could lie; I'm out of ideas.
EDIT
Since it was asked in the comments where the vector is used before the call to std::sort: this is the method called from the object constructor, and it is the only place the vector is used before the sorting. This way of reading (writing works similarly) from a binary file is based on this.
bool HighscoreManager::loadFromFile() {
  std::ifstream in(FileName, std::ios::in | std::ios::binary);
  if (!in) {
    return false;
  }
  try {
    std::vector<Highscore>::size_type size = 0;
    in.read((char*)&size, sizeof(size));
    _highscores.resize(size);
    in.read((char*)&_highscores[0], _highscores.size() * sizeof(Highscore));
  } catch (const std::exception& e) {
    std::cout << e.what() << std::endl;
  }
  in.close();
  sortAndChangeRanks(false);
  return in.good();
}
I don’t know what’s “optimized” about your high score storage. It seems like a waste of effort for nothing. You’re not storing millions of high scores; you could just store them as text. The “optimization” can’t be measured in normal use. And if you think you’re optimizing: show measurements. Otherwise you’re fooling yourself and wasting time.
On top of that, you’ve complicated the code enough that you ran into a problem that took a long time to debug. That’s a learning experience, but strictly speaking you wasted even more time because of it. Your time costs more than runtime, in most cases.
All you needed was trivial text stream I/O that can be done in two minutes; a sketch follows below. Messing about with binary storage is not advised if you don’t thoroughly understand what’s going on. As it stands, your code will crash or worse if you try reading high scores written on a machine with different endianness. And now you have to manage the endianness of all the numeric data… good luck.
In any case, it’s actually a pessimization, since you constantly reallocate the temporary string buffer. That buffer is not needed. You should resize the string itself and put the data in it:
std::string name(nLen, '\0');
in.read(&name[0], name.size());
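For reference, here is a minimal sketch of the trivial text approach mentioned above, assuming one "points name" pair per line; the exact format, and reusing the existing loadFromFile name, are my assumptions:

// Hypothetical plain-text load: each line holds "points name".
bool HighscoreManager::loadFromFile() {
  std::ifstream in(FileName);  // text mode, no std::ios::binary
  if (!in) return false;
  _highscores.clear();
  int points;
  std::string name;
  while (in >> points && std::getline(in >> std::ws, name)) {
    _highscores.emplace_back(0, name, points);  // ranks are assigned by sortAndChangeRanks
  }
  sortAndChangeRanks(false);
  return true;
}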
Here is the current solution I'm using. It works for me and solves my actual problem, which was with reading/writing a std::string to a binary file, not with the sorting method (thanks to the comments on the question!). To fix it I used parts of this.
reading from a file:
bool HighscoreManager::loadFromFile() {
  std::ifstream in(FileName, std::ios::in | std::ios::binary);
  if (!in) {
    return false;
  }
  try {
    std::vector<Highscore>::size_type size = 0;
    in.read((char*)&size, sizeof(size));
    for (int i = 0; i < size; i++) {
      int r, p;
      size_t nLen;
      in.read((char*)&r, sizeof(int));
      in.read((char*)&p, sizeof(int));
      in.read((char*)&nLen, sizeof(size_t));
      char* temp = new char[nLen + 1];
      in.read(temp, nLen);
      temp[nLen] = '\0';
      std::string name = temp;
      delete[] temp;
      _highscores.emplace_back(r, name, p);
    }
  } catch (const std::exception& e) {
    std::cout << e.what() << std::endl;
  }
  in.close();
  sortAndChangeRanks(false);
  return in.good();
}
writing to a file:
bool HighscoreManager::saveToFile() {
  std::ofstream out(FileName, std::ios::out | std::ios::binary);
  if (!out) {
    return false;
  }
  std::vector<Highscore>::size_type size = _highscores.size();
  try {
    out.write((char*)&size, sizeof(size));
    for (int i = 0; i < size; i++) {
      out.write((char*)&_highscores.at(i).rank, sizeof(int));
      out.write((char*)&_highscores.at(i).points, sizeof(int));
      size_t nameLen = _highscores.at(i).name.size();
      out.write((char*)&nameLen, sizeof(size_t));
      out.write((char*)_highscores.at(i).name.c_str(), nameLen);
    }
  } catch (const std::exception& e) {
    std::cout << e.what() << std::endl;
  }
  out.close();
  return out.good();
}
Thank you all for your help!

Can't find an array of bytes match from another process' memory

I'm currently trying to verify whether or not code I wrote using this as a reference works. I've managed to get it to run without crashing, but once I began checking whether the code actually does what I want it to do, I ran into a problem.
I think the code goes through all the memory regions belonging to the process I'm trying to search, but perhaps it isn't actually doing that?
I'm honestly not sure where the problem lies: am I searching for my array of bytes in the memory buffer incorrectly, or am I actually reading the wrong memory?
When checking whether my code finds an array-of-bytes match in the process, I scanned beforehand with Cheat Engine and then compared the results with what my program returned. I used an array of bytes I knew beforehand would always exist at least once in the program I'm scanning.
Comparing the results I got from Cheat Engine, which finds the pattern, with the ones from my program, which returns 0 results: that doesn't seem quite right.
I open the process which I want to read the memory from with the following flags:
PROCESS_VM_READ | PROCESS_QUERY_INFORMATION
And the way I call my pattern matching function:
testy.patternMatch("000000000000000001000000000000001CC80600");
As for my current code:
The function I'm calling (m_localprocess is an open handle to the process I got beforehand)
void process_wrapper::patternMatch(std::string pattern)
{
  MEMORY_BASIC_INFORMATION sys_info;
  std::vector<char> pattern_conv(pattern.begin(), pattern.end());
  for (unsigned char* pointer = NULL;
       VirtualQueryEx(m_localprocess, pointer, &sys_info, sizeof(sys_info)) == sizeof(sys_info);
       pointer += sys_info.RegionSize) {
    std::vector<char> mem_buffer;
    if (sys_info.State == MEM_COMMIT && (sys_info.Type == MEM_MAPPED || sys_info.Type == MEM_PRIVATE)) {
      SIZE_T bytes_read;
      mem_buffer.resize(sys_info.RegionSize);
      ReadProcessMemory(m_localprocess, pointer, &mem_buffer[0], sys_info.RegionSize, &bytes_read);
      if (GetLastError() != 0) {
        std::cout << "Error: " << GetLastError();
        SetLastError(0);
      }
      mem_buffer.resize(bytes_read);
      std::cout << "\nSize: " << mem_buffer.size() << "\n";
      if (mem_buffer.size() != 0) {
        find_all(mem_buffer.begin(), mem_buffer.end(), pattern_conv.begin(), pattern_conv.end(), pointer);
      }
    }
  }
  std::cout << "Results: " << results.size() << "\n";
  for (void* x : results) {
    std::cout << x << "\n";
  }
}
And this function calls the find_all function, which looks like this:
void find_all(std::vector<char>::iterator beg, std::vector<char>::iterator end,
              std::vector<char>::iterator beg_pattern, std::vector<char>::iterator end_pattern,
              const unsigned char* baseAddr) {
  std::vector<char>::iterator walk = beg;
  while (walk != end) {
    walk = std::search(walk, end, beg_pattern, end_pattern);
    if (walk != end) {
      std::cout << (void*)(baseAddr + (walk - beg)) << "\n";
      results.emplace_back((void*)(baseAddr + (walk - beg)));
      ++walk;
    }
  }
}
Any suggestions on other ways of implementing what I'm trying to do are more than welcome.
Thanks to the comment left by Jonathan, I realized I was comparing the ASCII characters of the pattern string instead of the actual byte values.
The code works now. The change I made to my code (got it from here):
void process_wrapper::patternMatch(std::string patternOrig)
{
  MEMORY_BASIC_INFORMATION sys_info;
  int len = patternOrig.length();
  std::string pattern;
  // Convert each pair of hex digits in the input into one raw byte.
  for (int i = 0; i < len; i += 2)
  {
    std::string byte = patternOrig.substr(i, 2);
    char chr = (char)(int)strtol(byte.c_str(), NULL, 16);
    pattern.push_back(chr);
  }
  //...

C++ Spell checking program with two classes; Dictionary and word

Here is the specification for the code:
You are to use the Word and Dictionary classes defined below and write all member functions and any necessary supporting functions to achieve the specified result.
The Word class should dynamically allocate memory for each word to be stored in the dictionary.
The Dictionary class should contain an array of pointers to Word. Memory for this array must be dynamically allocated. You will have to read the words in from the file. Since you do not know the "word" file size, you do not know how large to allocate the array of pointers. You are to let this grow dynamically as you read the file in. Start with an array size of 8. When that array is filled, double the array size, copy the original 8 words to the new array, and continue.
You can assume the "word" file is sorted, so your Dictionary::find() function must contain a binary search algorithm. You might want to save this requirement for later - until you get the rest of your program running.
Make sure you store words in the dictionary as lower case and that you convert the input text to the same case - that way your Dictionary::find() function will successfully find "Four" even though it is stored as "four" in your Dictionary.
Here is my code so far.
#include <cstring>
#include <iostream>
#include <fstream>
using namespace std;

class Word
{
  char* word_;
public:
  Word(const char* text = 0);
  ~Word() { delete[] word_; word_ = nullptr; }
  const char* word() const;
};

Word::Word(const char* arg)
  : word_(new char[strlen(arg) + 1])
{
  strcpy(word_, arg);
}

const char* Word::word() const
{
  return word_;
}

class Dictionary
{
  Word** words_;
  unsigned int capacity_; // max number of words Dictionary can hold
  unsigned int numberOfWordsInDictionary_;
  void resize() {
    capacity_ = capacity_ * 2;
    cout << "Size = " << capacity_ << endl;
  };
  void addWordToDictionary(char* word) { words_ += *word; };
public:
  Dictionary(const char* filename);
  ~Dictionary() {
    delete[] words_; words_ = nullptr;
  };
  bool find(const char* word);
};

Dictionary::Dictionary(const char* filename)
  : words_(new Word*[8]), capacity_(8), numberOfWordsInDictionary_(0)
{
  ifstream fin(filename);
  if (!filename) {
    cout << "Failed to open file!" << endl;
  }
  char buffer[32];
  while (fin.getline(buffer, sizeof(buffer)))
  {
    if (numberOfWordsInDictionary_ == capacity_)
    {
      resize();
    }
    addWordToDictionary(buffer);
  }
}

bool Dictionary::find(const char* left)
{
  int last = capacity_ - 1,
      first = 0,
      middle;
  bool found = false;
  while (!found && first <= last) {
    middle = (first + last) / 2;
    if (strcmp(left, reinterpret_cast<char*>(words_[middle])) == 0) {
      found = true;
    }
    else if (left > reinterpret_cast<char*>(words_[middle]))
      last = middle - 1;
    else
      first = middle + 1;
  }
  return found;
}

bool cleanupWord(char x[]) {
  bool lower = false;
  int i = 0;
  while (x[i]) {
    char c = x[i];
    putchar(tolower(c));
    lower = true;
  }
  return lower;
}

int main()
{
  char buffer[32];
  Dictionary Websters("words.txt");
  ifstream fin("gettysburg.txt");
  cout << "\nSpell checking " << "gettysburg.text" << "\n\n";
  while (fin >> buffer) {
    if (cleanupWord(buffer) == true) {
      if (!Websters.find(buffer)) {
        cout << buffer << " not found in the Dictionary\n";
      }
    }
  }
  system("PAUSE");
}
When I run the program it stops after outputting "Spell checking gettysburg.text" and I don't know why. Thank you!
The most likely cause of this problem is that the text files have not been opened. Add a check with is_open to make sure they have been opened.
When using relative paths (any path that does not go all the way back to the root of the file system, i.e. an absolute path), take care that the program is run from the directory you believe it to be in; it is not always the same directory as the executable. Search term to learn more about this: Working Directory.
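For example, in main (and the same idea applies to words.txt inside the Dictionary constructor):

ifstream fin("gettysburg.txt");
if (!fin.is_open()) {
  cout << "Failed to open gettysburg.txt" << endl;
  return 1;  // stop instead of spell checking an unopened stream
}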
Now on to other reasons this program will not work:
void addWordToDictionary(char* word) { words_ += *word; };
is not adding words to the dictionary. Instead it advances the address at which words_ points by the numeric value of the letter at *word. This is extremely destructive: it loses the pointer to the buffer allocated for words_ in the constructor, making delete[] words_; in the Dictionary destructor ineffective and probably fatal.
Instead you want to (note that I use want to with a bit of trepidation; what you really want is std::vector and std::string, but I strongly suspect that would upset the assignment's marker):
Dynamically allocate a new Word with new.
Place this word in a free spot in the words_ array. Something along the lines of words_[numberOfWordsInDictionary_] = myNewWord;
Increase numberOfWordsInDictionary_ by 1.
Note that the Words allocated with new must all be released in the Dictionary destructor. You will want a for loop to help with this.
In addition, I would move the

if (numberOfWordsInDictionary_ == capacity_)
{
  resize();
}

from the Dictionary constructor to addWordToDictionary, so that the array is properly sized any time addWordToDictionary is called.
Hmmm. While we're at it, let's look at resize
void resize() {
  capacity_ = capacity_ * 2;
  cout << "Size = " << capacity_ << endl;
};
This increases the object's capacity_ but does nothing to allocate more storage for words_. This needs to be corrected (a sketch follows the list). You must:
Double the value of capacity_. You already have this.
Allocate a larger replacement buffer for words_ with new.
Copy all of the Words in words_ to the larger buffer.
Free the buffer currently pointed to by words_.
Point words_ at the new, larger buffer.
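Putting those steps together, a rough sketch of both helpers (names match the question; the const char* parameter and the copy loop are my choices, not the only way to do it). Remember the destructor must also delete each stored Word, as noted above:

void Dictionary::resize() {
  capacity_ = capacity_ * 2;
  Word** bigger = new Word*[capacity_];  // larger replacement buffer
  for (unsigned int i = 0; i < numberOfWordsInDictionary_; ++i) {
    bigger[i] = words_[i];               // copy the existing Word pointers
  }
  delete[] words_;                       // free the old pointer array
  words_ = bigger;                       // point words_ at the new buffer
}

void Dictionary::addWordToDictionary(const char* word) {
  if (numberOfWordsInDictionary_ == capacity_) {
    resize();                            // grow before running out of slots
  }
  words_[numberOfWordsInDictionary_] = new Word(word);
  ++numberOfWordsInDictionary_;
}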
Addendum
I haven't looked closely at find because the carnage required to fix the reading and storage of the dictionary will most likely render find unusable even if it does currently work. The use of reinterpret_cast<char*> is an alarm bell, though. There should be no reason for a cast, let alone the most permissive of them all, in a find function. Rule of thumb: When you see a reinterpret_cast and you don't know what it's for, assume it's hiding a bug and approach it with caution and suspicion.
In addition to investigating the Rule of Three mentioned in the comments, look into the Rule of Five. This will allow you to make a much simpler, and probably more efficient, dictionary based around Word* words_, where words_ points directly to an array of Word instead of an array of pointers to Word.
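As a sketch of what the Rule of Three adds to Word (the default argument is changed from 0 to "" here so strlen always gets a valid pointer; treat this as an illustration, not the assignment's required shape):

class Word
{
  char* word_;
public:
  explicit Word(const char* text = "") : word_(new char[strlen(text) + 1]) {
    strcpy(word_, text);
  }
  ~Word() { delete[] word_; }
  // Rule of Three: a class that owns memory needs a copy constructor
  // and copy assignment to go with its destructor.
  Word(const Word& other) : word_(new char[strlen(other.word_) + 1]) {
    strcpy(word_, other.word_);
  }
  Word& operator=(const Word& other) {
    if (this != &other) {
      char* copy = new char[strlen(other.word_) + 1];  // allocate before freeing
      strcpy(copy, other.word_);
      delete[] word_;
      word_ = copy;
    }
    return *this;
  }
  const char* word() const { return word_; }
};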

Implementing cdbpp library for string values

I am trying to implement the cdbpp library from chokkan. I am facing some problems trying to make it work with string values.
The original code and documentation can be found here:
http://www.chokkan.org/software/cdbpp/ and the git source code is here: https://github.com/chokkan/cdbpp
This is what I have so far:
In sample.cpp (from where I am calling the main function), I modified the build() function:
bool build()
{
// Open a database file for writing (with binary mode).
std::ofstream ofs(DBNAME, std::ios_base::binary);
if (ofs.fail()) {
std::cerr << "ERROR: Failed to open a database file." << std::endl;
return false;
}
try {
// Create an instance of CDB++ writer.
cdbpp::builder dbw(ofs);
// Insert key/value pairs to the CDB++ writer.
for (int i = 1;i < N;++i) {
std::string key = int2str(i);
const char* val = "foobar"; //string value here
dbw.put(key.c_str(), key.length(), &val, sizeof(i));
}
} catch (const cdbpp::builder_exception& e) {
// Abort if something went wrong...
std::cerr << "ERROR: " << e.what() << std::endl;
return false;
}
return true;
}
and in the cdbpp.h file, I modified the put() function as:
void put(const key_t* key, size_t ksize, const value_t* value, size_t vsize)
{
  // Write out the current record.
  std::string temp2 = *value;
  const char* temp = temp2.c_str();
  write_uint32((uint32_t)ksize);
  m_os.write(reinterpret_cast<const char*>(key), ksize);
  write_uint32((uint32_t)vsize);
  m_os.write(reinterpret_cast<const char*>(temp), vsize);
  // Compute the hash value and choose a hash table.
  uint32_t hv = hash_function()(static_cast<const void*>(key), ksize);
  hashtable& ht = m_ht[hv % NUM_TABLES];
  // Store the hash value and offset to the hash table.
  ht.push_back(bucket(hv, m_cur));
  // Increment the current position.
  m_cur += sizeof(uint32_t) + ksize + sizeof(uint32_t) + vsize;
}
Now I get the correct value if the string is three or fewer characters (e.g. foo returns foo). If it is longer than 3, it gives me the correct string up to 3 characters and then garbage (e.g. foobar gives me foo�`).
I am a little new to C++ and I would appreciate any help you could give me.
(Moving a possible answer from the comments into a real answer.)
vsize as passed into put is the size of an integer, when it should be the length of the value string.
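With the original, unmodified put() (which simply writes vsize bytes from the pointer it is given), the call in build() would then become something like this; storing the terminating '\0' via the +1 is a choice, not a requirement:

const char* val = "foobar";
// Pass the string bytes themselves, not &val, and the real length.
dbw.put(key.c_str(), key.length(), val, strlen(val) + 1);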

SIGSEGV when dynamically allocating memory to receive FTP server's LIST response

I am building an FTP client in C++ for personal use and for the learning experience, but I have run into a problem when allocating memory for storing LIST responses. The library I am using for FTP requests is libcurl, which calls the following function when it receives a response from the server:
size_t FTP_getList(char *ptr, size_t size, size_t nmemb, void *userdata) {
  //GLOBAL_FRAGMENT is global
  //libcurl will split the resulting list into smaller approx 2000 character
  //strings to pass into this function so I compensate by storing the leftover
  //fragment in a global variable.
  size_t fraglen = 0;
  if(GLOBAL_FRAGMENT!=NULL) {
    fraglen = strlen(GLOBAL_FRAGMENT);
  }
  size_t listlen = size*nmemb+fraglen+1;
  std::cout<<"Size="<<size<<" nmemb="<<nmemb;
  char *list = new char[listlen];
  if(GLOBAL_FRAGMENT!=NULL) {
    snprintf(list,listlen,"%s%s",GLOBAL_FRAGMENT,ptr);
  } else {
    strncpy(list,ptr,listlen);
  }
  list[listlen]=0;
  size_t packetSize = strlen(list);
  std::cout<<list;
  bool isComplete = false;
  //Check to see if the last line is complete (i.e. newline terminated)
  if(list[size]=='\n') {
    isComplete = true;
  }
  if(GLOBAL_FRAGMENT!=NULL) {
    delete[] GLOBAL_FRAGMENT;
  }
  GLOBAL_FRAGMENT = GLOBAL_FTP->listParse(list,isComplete);
  delete[] list;
  //We return the length of the new string to prove to libcurl that
  //our function executed properly
  return size*nmemb;
}
The function above calls the next function, which splits each line of the response into individual strings for further processing:
char* FTP::listParse(char* list, bool isComplete) {
  //std::cout << list;
  //We split the list into separate lines to deal with independently
  char* line = strtok(list,"\n");
  int count = 0;
  while(line!=NULL) {
    count++;
    line = strtok(NULL,"\n");
  }
  //std::cout << "List Count: " << count << "\n";
  int curPosition = 0;
  for(int i = 0; i < count-1; i++) {
    //std::cout << "Iteration: " << i << "\n";
    curPosition = curPosition + lineParse((char*)&(list[curPosition])) + 1;
  }
  if(isComplete) {
    lineParse((char*)&(list[curPosition]));
    return NULL;
  } else {
    int fraglen = strlen((char*)&(list[curPosition]));
    char* frag = new char[fraglen+1];
    strcpy(frag,(char*)&(list[curPosition]));
    frag[fraglen] = 0;
    return frag;
  }
}
The function above then calls the function below to split the individual entries in a line into separate tokens:
int FTP::lineParse(char *line) {
  int result = strlen(line);
  char* value = strtok(line, " ");
  while(value!=NULL) {
    //std::cout << value << "\n";
    value = strtok(NULL, " ");
  }
  return result;
}
This program works for relatively small list responses, but when I tried stress testing it by getting a listing for a remote directory with ~10,000 files in it, my program threw a SIGSEGV. I used backtrace in gdb and found that the segfault happens on the lines delete[] GLOBAL_FRAGMENT; and delete[] list; in FTP_getList. Am I not properly deleting these arrays? I am calling delete[] exactly once for each time I allocate them, so I don't see why it wouldn't be allocating memory correctly...
On a side note: is it necessary to check whether a pointer is NULL before you delete it?
Also, I know this would be easier to do with std::string, but I am trying to learn C-style strings as practice, and the fact that it is crashing is a perfect example of why I need the practice. I will also change the code to store these in a dynamically allocated buffer that is only reallocated when the new size is larger than the previous length, but I want to figure out why the current code isn't working first. :-) Any help would be appreciated.
In this code
size_t listlen = size*nmemb+fraglen+1;
std::cout<<"Size="<<size<<" nmemb="<<nmemb;
char *list = new char[listlen];
if(GLOBAL_FRAGMENT!=NULL) {
  snprintf(list,listlen,"%s%s",GLOBAL_FRAGMENT,ptr);
} else {
  strncpy(list,ptr,listlen);
}
list[listlen]=0;
You are overrunning your list buffer: you allocated listlen bytes, but you write a 0 value one past the last allocated byte. This invokes undefined behavior. More practically speaking, it can cause heap corruption, which can cause the kind of errors you observed.
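A minimal fix sketch: listlen already reserves one byte for the terminator, so the last valid index is listlen - 1:

size_t listlen = size*nmemb+fraglen+1;  // the +1 reserves room for '\0'
char *list = new char[listlen];
// ... fill list as before ...
list[listlen - 1] = 0;  // last allocated byte; list[listlen] is out of bounds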
I didn't see any issues with the way you are calling delete[].
It is perfectly safe to delete a NULL pointer.