I have to write a small console program for a developer internship interview and something big and very hard to find is going wrong. I'm supposed to write a program that checks a directory full of binary .dat files for duplicate files.
What I did:
I read a directory path from stdin in main.cpp, and if the directory exists I pass it on to my fileChecker function, which generates MD5 hashes for all the files in the given directory and builds a QHash with the file names as keys and the hashes as values. I then try to iterate over the QHash using a Java-style iterator. When I run the program it crashes completely and I have to choose between debugging and ending the program, which makes it impossible for me to figure out what's going wrong, as Qt's debugger doesn't output anything.
My guess is that something is going wrong in my getDuplicateList function in fileChecker.cpp, as I've never used Java-style iterators to iterate over a QHash before. I'm trying to take the first key-value pair and store it in two variables. Then I remove those values from the QHash and try to iterate over the remainder of the QHash using an iterator inside the previous iterator. If anyone has any idea what I'm doing wrong, please let me know as soon as possible, as I need to have this done before Monday to get an interview. The code for main.cpp, fileChecker.h and fileChecker.cpp is below; please let me know if there's anything more I can add.
Thanks
my code:
main.cpp:
#include "filechecker.h"
#include <QDir>
#include <QTextStream>
#include <QString>
#include <QStringList>
QTextStream in(stdin);
QTextStream out(stdout);
int main() {
QDir* dir;
FileChecker checker;
QString dirPath;
QStringList duplicateList;
out << "Please enter directory path NOTE: use / as directory separator regardless of operating system" << endl;
dirPath = in.readLine();
dir->setPath(dirPath);
if(dir->exists()) {
checker.processDirectory(dir);
duplicateList = checker.getDuplicateList();
}
else if(!(dir->exists()))
out << "Directory does not exist" << endl;
foreach(QString str, duplicateList){
out << str << endl;
}
return 0;
}
fileChecker.h:
#ifndef FILECHECKER_H
#define FILECHECKER_H
#include <QString>
#include <QByteArray>
#include <QHash>
#include <QCryptographicHash>
#include <QStringList>
#include <QDir>
class FileChecker
{
public:
FileChecker();
void processDirectory(QDir* dir);
QByteArray generateChecksum(QFile* file);
QStringList getDuplicateList();
private:
QByteArray generateChecksum(QString fileName);
QHash<QString, QByteArray> m_hash;
};
#endif // FILECHECKER_H
fileChecker.cpp:
#include "filechecker.h"
FileChecker::FileChecker() {
}
void FileChecker::processDirectory(QDir* dir) {
dir->setFilter(QDir::Files);
QStringList fileList = dir->entryList();
for (int i = 0; i < fileList.size(); i++) {
bool possibleDuplicatesFound = false;
QString testName = fileList.at((i));
QFile* testFile;
testFile->setFileName(testName);
foreach(QString s, fileList) {
QFile* possibleDuplicate;
possibleDuplicate->setFileName(s);
if(testFile->size() == possibleDuplicate->size() && testFile->fileName() != possibleDuplicate->fileName()) {
QByteArray md5HashPd = generateChecksum(possibleDuplicate);
m_hash.insert(possibleDuplicate->fileName(), md5HashPd);
possibleDuplicatesFound = true;
fileList.replaceInStrings(possibleDuplicate->fileName(), "");
}
QByteArray md5Hasht = generateChecksum(testFile);
fileList.replaceInStrings(testFile->fileName(), "");
possibleDuplicatesFound = false;
}
}
}
QByteArray FileChecker::generateChecksum(QFile* file) {
if(file->open(QIODevice::ReadOnly)) {
QCryptographicHash cHash(QCryptographicHash::Md5);
cHash.addData(file->readAll());
QByteArray checksum = cHash.result();
return checksum;
}
}
QStringList FileChecker::getDuplicateList() {
QStringList tempList;
QString tempStr;
QString currentKey;
QByteArray currentValue;
QMutableHashIterator<QString, QByteArray> i(m_hash);
do {
while (i.hasNext()){
i.next();
currentKey = i.key();
currentValue = i.value();
tempStr.append("%1 ").arg(currentKey);
if (i.value() == currentValue) {
tempStr.append("and %1").arg(i.key());
i.remove();
}
tempList.append(tempStr);
tempStr.clear();
}
} while (m_hash.size() > 0);
return tempList;
}
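Aside from the Qt memory management problem (dir, testFile and possibleDuplicate are pointers that are dereferenced without ever being initialized), you really don't have to calculate MD5 sums of all the files.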
Just for groups of files of equal size :)
Files with a unique size can be left out. I wouldn't even call this an optimization but simply not doing a potentially absurd amount of unnecessary extra work :)
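A minimal sketch of that idea, using only the standard library rather than Qt: bucket the files by size and keep only the buckets with two or more entries, since only those need hashing. The names FileEntry and candidateGroups are hypothetical and not part of the question's code.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical record: a file name plus its size in bytes.
struct FileEntry {
    std::string name;
    std::uintmax_t size;
};

// Returns only the groups of files that share a size. A file with a unique
// size cannot have a duplicate, so it never needs to be hashed.
std::vector<std::vector<std::string>> candidateGroups(const std::vector<FileEntry>& files) {
    std::map<std::uintmax_t, std::vector<std::string>> bySize;
    for (const FileEntry& f : files)
        bySize[f.size].push_back(f.name);

    std::vector<std::vector<std::string>> groups;
    for (const auto& bucket : bySize)
        if (bucket.second.size() > 1)
            groups.push_back(bucket.second);
    return groups;
}
```

Each returned group would then be hashed and compared; everything else is skipped without being opened.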
All Qt Java-style iterators come in "regular" (const) and mutable versions (where it is safe to modify the object you are iterating). See QMutableHashIterator. You're modifying a const iterator; thus, it crashes.
While you're at it, look at the findNext function the iterator provides. Using this function eliminates the need for your second iterator.
Just add i.next(), as follows.
do {
while (i.hasNext()) {
i.next();
currentKey = i.key();
currentValue = i.value();
tempStr.append(currentKey);
m_hash.remove(currentKey);
QHashIterator<QString, QByteArray> j(m_hash);
while (j.hasNext()) {
j.next(); // advance before reading key()/value(), otherwise the loop never ends
if (j.value() == currentValue) {
tempStr.append(" and %1").arg(j.key());
m_hash.remove(j.key());
}
}
tempList.append(tempStr);
tempStr.clear();
}
} while (m_hash.size() > 1);
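An alternative that avoids removing entries while iterating is to invert the map: key it by checksum and collect the file names under each checksum. The sketch below uses standard containers as a stand-in for QHash<QString, QByteArray>; groupByChecksum and ChecksumByName are hypothetical names, and the "a and b" output format is an assumption based on the strings built in the question.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Stand-in for QHash<QString, QByteArray>: file name -> checksum bytes.
using ChecksumByName = std::map<std::string, std::string>;

// Inverts the map so that every checksum lists the files that produced it,
// then reports each checksum shared by two or more files as "a and b and c".
std::vector<std::string> groupByChecksum(const ChecksumByName& checksums) {
    std::map<std::string, std::vector<std::string>> namesByChecksum;
    for (const auto& entry : checksums)
        namesByChecksum[entry.second].push_back(entry.first);

    std::vector<std::string> report;
    for (const auto& group : namesByChecksum) {
        const std::vector<std::string>& names = group.second;
        if (names.size() < 2)
            continue;  // unique checksum, so not a duplicate
        std::string line = names[0];
        for (std::size_t i = 1; i < names.size(); ++i)
            line += " and " + names[i];
        report.push_back(line);
    }
    return report;
}
```

Because the source map is never modified, no mutable iterator is needed and there is nothing to invalidate.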
Some things that stand out:
It's a bad idea to readAll the file: it will allocate a file-sized block on the heap, only to calculate its hash and discard it. That's very wasteful. Instead, leverage the QCryptographicHash::addData(QIODevice*): it will stream the data from the file, only keeping a small chunk in memory at any given time.
You're explicitly keeping an extra copy of the entry list of a folder. This is likely unnecessary. Internally, the QDirIterator will use platform-specific ways of iterating a directory, without obtaining a copy of the entry list. Only the OS has the full list, the iterator only iterates it. You still need to hold the size,path->hash map of course.
You're using Java iterators. These are quite verbose. The C++ standard-style iterators are supported by many containers, so you could easily substitute other containers from e.g. C++ standard library or boost to tweak performance/memory use.
You're not doing enough error checking.
The code seems overly verbose for the little it's actually doing. The encapsulation of everything into a class is probably also a Java habit, and rather unnecessary here.
Let's see what might be the most to-the-point, reasonably performant way of doing it. I'm skipping the UI niceties: you can either call it with no arguments to check in the current directory, or with arguments, the first of which will be used as the path to check in.
The auto & dupe = entries[size][hash.result()]; is a powerful expression. It will construct the potentially missing entries in the external and internal map.
// https://github.com/KubaO/stackoverflown/tree/master/questions/dupechecker-37557870
#include <QtCore>
#include <cstdio>
QTextStream out(stdout);
QTextStream err(stderr);
int check(const QString & path) {
int unique = 0;
// size hash path
QMap<qint64, QMap<QByteArray, QString>> entries;
QDirIterator it(path, QDirIterator::Subdirectories | QDirIterator::FollowSymlinks);
QCryptographicHash hash{QCryptographicHash::Sha256};
while (it.hasNext()) {
it.next();
auto const info = it.fileInfo();
if (info.isDir()) continue;
auto const path = info.absoluteFilePath();
auto const size = info.size();
if (size == 0) continue; // all zero-sized files are "duplicates" but let's ignore them
QFile file(path); // RAII class, no need to explicitly close
if (!file.open(QIODevice::ReadOnly)) {
err << "Can't open " << path << endl;
continue;
}
hash.reset();
hash.addData(&file);
if (file.error() != QFile::NoError) {
err << "Error reading " << path << endl;
continue;
}
auto & dupe = entries[size][hash.result()];
if (! dupe.isNull()) {
// duplicate
out << path << " is a duplicate of " << dupe << endl;
} else {
dupe = path;
++ unique;
}
}
return unique;
}
int main(int argc, char ** argv) {
QCoreApplication app{argc, argv};
QDir dir;
if (argc == 2)
dir = app.arguments().at(1);
auto unique = check(dir.absolutePath());
out << "Finished. There were " << unique << " unique files." << endl;
}
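The nested entries[size][hash.result()] lookup used above relies on operator[] creating missing entries on first access. The same behavior can be shown with standard containers; Entries and recordOrDetect are hypothetical names, and the sketch assumes that paths are never empty strings.

```cpp
#include <cstdint>
#include <map>
#include <string>

// size -> (digest -> first path seen with that size and digest)
using Entries = std::map<std::int64_t, std::map<std::string, std::string>>;

// Returns true when `path` duplicates an earlier file. operator[] creates
// the missing inner map and the missing (empty) string on first access,
// so a single lookup both tests for an entry and makes room to record one.
bool recordOrDetect(Entries& entries, std::int64_t size,
                    const std::string& digest, const std::string& path) {
    std::string& first = entries[size][digest];
    if (!first.empty())
        return true;   // an earlier file already claimed this slot
    first = path;      // first time this (size, digest) pair is seen
    return false;
}
```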
I'm trying to figure out why using the merge operator for a large number of keys with rocksdb is very slow.
My program uses a simple associative merge operator (based on upstream StringAppendOperator) that concatenates values using a delimiter for a given key. It takes a very long time to merge all the keys and for the program to finish running.
PS: I built rocksdb from source - latest master. I'm not sure if I'm missing something very obvious.
Here's a minimally reproducible example with about 5 million keys - number of keys can be adjusted by changing the limit of the for loop. Thank you in advance!
#include <filesystem>
#include <iostream>
#include <utility>
#include <rocksdb/db.h>
#include "rocksdb/merge_operator.h"
// Based on: https://github.com/facebook/rocksdb/blob/main/utilities/merge_operators/string_append/stringappend.h#L13
class StringAppendOperator : public rocksdb::AssociativeMergeOperator
{
public:
// Constructor: specify delimiter
explicit StringAppendOperator(std::string delim) : delim_(std::move(delim)) {};
bool Merge(const rocksdb::Slice &key, const rocksdb::Slice *existing_value,
const rocksdb::Slice &value, std::string *new_value,
rocksdb::Logger *logger) const override;
static const char *kClassName() { return "StringAppendOperator"; }
static const char *kNickName() { return "stringappend"; }
[[nodiscard]] const char *Name() const override { return kClassName(); }
[[nodiscard]] const char *NickName() const override { return kNickName(); }
private:
std::string delim_;// The delimiter is inserted between elements
};
// Implementation for the merge operation (concatenates two strings)
bool StringAppendOperator::Merge(const rocksdb::Slice & /*key*/,
const rocksdb::Slice *existing_value,
const rocksdb::Slice &value, std::string *new_value,
rocksdb::Logger * /*logger*/) const
{
// Clear the *new_value for writing.
assert(new_value);
new_value->clear();
if (!existing_value)
{
// No existing_value. Set *new_value = value
new_value->assign(value.data(), value.size());
}
else
{
// Generic append (existing_value != null).
// Reserve *new_value to correct size, and apply concatenation.
new_value->reserve(existing_value->size() + delim_.size() + value.size());
new_value->assign(existing_value->data(), existing_value->size());
new_value->append(delim_);
new_value->append(value.data(), value.size());
std::cout << "Merging " << value.data() << "\n";
}
return true;
}
int main()
{
rocksdb::Options options;
options.create_if_missing = true;
options.merge_operator.reset(new StringAppendOperator(","));
// tried a variety of settings
options.max_background_compactions = 16;
options.max_background_flushes = 16;
options.max_background_jobs = 16;
options.max_subcompactions = 16;
rocksdb::DB *db{};
auto s = rocksdb::DB::Open(options, "/tmp/test", &db);
assert(s.ok());
rocksdb::WriteBatch wb;
for (uint64_t i = 0; i < 2500000; i++)
{
wb.Merge("a:b", std::to_string(i));
wb.Merge("c:d", std::to_string(i));
}
db->Write(rocksdb::WriteOptions(), &wb);
db->Flush(rocksdb::FlushOptions());
rocksdb::ReadOptions read_options;
rocksdb::Iterator *it = db->NewIterator(read_options);
for (it->SeekToFirst(); it->Valid(); it->Next())
{
std::cout << it->key().ToString() << " --> " << it->value().ToString() << "\n";
}
delete it;
delete db;
std::filesystem::remove_all("/tmp/test");
return 0;
}
Shared your question in the Speedb Hive, on Discord.
The reply is from Hilik, our co-founder and chief scientist.
'Merge operators are very useful to get a quick write response time; alas, reads require reading the original and applying the merges in order. This operation may be very expensive, especially with strings that need to be copied and appended on each merge. The simplest way to resolve this is to use read-modify-write eventually. Doing this at the application level is possible but may be problematic (if two threads can do this operation concurrently). We are thinking of ways to resolve this during the compaction and are willing to work with you on a PR...'
Hope this helps. Join the discord server to participate in this discussion and many other interesting and related topics.
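The cost described in the reply can be modeled with plain strings. This is only a sketch of the copy-and-append pattern from the question's StringAppendOperator, not RocksDB's actual code path; mergeOneByOne and joinOnce are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Applies the operands one at a time the way the append merge does: each
// step builds a NEW string from the existing value plus one operand.
// `bytesCopied` accumulates how many bytes of the existing value were re-copied.
std::string mergeOneByOne(const std::vector<std::string>& operands, char delim,
                          std::size_t& bytesCopied) {
    std::string existing;
    bool hasValue = false;
    for (const std::string& op : operands) {
        std::string next;
        if (hasValue) {
            bytesCopied += existing.size();
            next.reserve(existing.size() + 1 + op.size());
            next.assign(existing);
            next.push_back(delim);
        }
        next.append(op);
        existing.swap(next);
        hasValue = true;
    }
    return existing;
}

// Builds the same result in a single pass by appending in place.
std::string joinOnce(const std::vector<std::string>& operands, char delim) {
    std::string out;
    for (std::size_t i = 0; i < operands.size(); ++i) {
        if (i != 0)
            out.push_back(delim);
        out.append(operands[i]);
    }
    return out;
}
```

In this model the re-copied byte count grows roughly with the square of the number of operands, which is consistent with millions of merges on one key being slow. Buffering the values in the application and writing the key once corresponds to joinOnce.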
I'm making a program to help me understand the ins-and-outs of std::filesystem. However, when I went to build, I got an error (C2440) that I cannot convert type 'int' to '_Valty' when using directory_iterator in conjunction with directory_entry. It shows the error in the filesystem code, so I don't know where it's causing it in my code.
#include "Replacer.h"
#include<lmcons.h>
Replacer& Replacer::GetInstance()
{
static Replacer instance;
return instance;
}
void Replacer::Init()
{
std::string base_path = "C:/Users/";
// Get the username.
WCHAR username[UNLEN + 1];
DWORD username_len = UNLEN + 1;
GetUserName(username, &username_len);
// Add it to the string.
base_path.append((char*)username);
base_path.shrink_to_fit(); // Gotta make sure.
// Set the base path.
begining_path = new fs::path(base_path);
// Set the current path to the beginning path.
current_path = new fs::path(begining_path); /// Hate that I have to use copy, but oh well.
return;
}
void Replacer::Search(UINT cycles)
{
// I have no interest in replacing folder names...
// Just file names and data.
for (UINT i = 0; i < cycles; ++i) // MAIN LOOP.
{
VisualUpdater(i);
SearchCurrentPath(current_path);
}
return;
}
void Replacer::Unload()
{
delete begining_path;
delete current_path;
begining_path = nullptr;
current_path = nullptr;
}
Replacer::Replacer()
: begining_path(), current_path()
{}
void Replacer::Replace(std::string& filename)
{
// We have found a file that we need to replace.
/// Of couse we have dumbass, we're here, aren't we?
// Open up the file...
std::ofstream out;
out.open(filename, std::ios::out);
out.clear(); // Clear the file.
out.write(THE_WORD, std::string(THE_WORD).size()); // Replace the data with the word.
// Replace the filename with the word.
fs::rename(std::string(current_path->string()).append('/' + filename), THE_WORD);
return;
}
void Replacer::ChangeDirectory(fs::directory_entry &iter)
{
*current_path = iter.path(); // Change the current path to the next path.
SearchCurrentPath(current_path); // This is where the recursion begins.
}
void Replacer::VisualUpdater(UINT cycles)
{
std::cout << "\nCycle #: " << cycles;
std::cout << "\nCurrent path: " << current_path->string();
std::cout << "\nBase path: " << begining_path->string();
std::cout << "\n" << NUM_CYCLES - cycles << " cycles left." << std::endl;
}
void Replacer::SearchCurrentPath(fs::path *curr)
{
for (auto& i : fs::directory_iterator(curr->string()))
{
if (i.path().empty())
continue; // This *does* come in handy.
if (fs::is_regular_file(i)) // We have to check if it is a regular file so we can change the
{ // name and the data.
std::string temp(i.path().filename().string());
Replace(temp);
}
else
{
// Here is where we move up a directory.
fs::directory_entry entry = i;
ChangeDirectory(entry);
}
}
}
If I had to take a guess, I would assume it's in the last function written above, but I'm not entirely sure. Anyone have any ideas on how I would go about fixing this?
So, in the end I figured it out. For anyone curious, it wasn't in the bottom functions. It was the place where I was using a copy constructor up in the Init function. I guess filesystem doesn't like that.
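A likely reading of the posted code is that begining_path is an fs::path*, so new fs::path(begining_path) hands the constructor a pointer rather than a path; dereferencing it (*begining_path) would select the copy constructor. Storing the paths by value sidesteps the problem entirely. The sketch below assumes that simplification; PathState is a hypothetical class, not the original Replacer.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical, simplified stand-in for the question's Replacer: paths are
// stored by value, so there is no new/delete and no pointer/object mix-up.
class PathState {
public:
    explicit PathState(const std::string& base)
        : beginning_path_(base), current_path_(beginning_path_) {}  // plain copy

    const fs::path& beginning() const { return beginning_path_; }
    const fs::path& current() const { return current_path_; }

    // Descends from the current path into the named child directory.
    void enter(const std::string& child) { current_path_ /= child; }

private:
    fs::path beginning_path_;  // declared first, so initialized first
    fs::path current_path_;
};
```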
I'm creating a native Node.js extension for RocksDB and have pinned down an issue that I cannot explain. The following piece of code works perfectly:
std::string v;
ROCKSDB_STATUS_THROWS(db->Get(*options, k, &v));
napi_value result;
NAPI_STATUS_THROWS(napi_create_buffer_copy(env, v.size(), v.c_str(), nullptr, &result));
return result;
But when I introduce an optimization that reduces one extra memcpy I get segfaults:
std::string *v = new std::string();
ROCKSDB_STATUS_THROWS(db->Get(*options, k, v)); // <============= I get segfaults here
napi_value result;
NAPI_STATUS_THROWS(napi_create_external_buffer(env, v->size(), (void *)v->c_str(), rocksdb_get_finalize, v, &result));
return result;
Here's Get method signature:
rocksdb::Status rocksdb::DB::Get(const rocksdb::ReadOptions &options, const rocksdb::Slice &key, std::string *value)
Any thoughts on why this issue might happen?
Thank you in advance!
Edit
Just to be sure, I've also checked the following version (it also fails):
std::string *v = new std::string();
ROCKSDB_STATUS_THROWS(db->Get(*options, k, v));
napi_value result;
NAPI_STATUS_THROWS(napi_create_buffer_copy(env, v->size(), v->c_str(), nullptr, &result));
delete v;
Edit
As per request in comments providing more complete example:
#include <napi-macros.h>
#include <node_api.h>
#include <rocksdb/db.h>
#include <rocksdb/convenience.h>
#include <rocksdb/write_batch.h>
#include <rocksdb/cache.h>
#include <rocksdb/filter_policy.h>
#include <rocksdb/cache.h>
#include <rocksdb/comparator.h>
#include <rocksdb/env.h>
#include <rocksdb/options.h>
#include <rocksdb/table.h>
#include "easylogging++.h"
INITIALIZE_EASYLOGGINGPP
...
/**
* Runs when a rocksdb_get return value instance is garbage collected.
*/
static void rocksdb_get_finalize(napi_env env, void *data, void *hint)
{
LOG_IF(logging_enabled, INFO) << LOCATION << " rocksdb_get_finalize (started)";
if (hint)
{
LOG_IF(logging_enabled, INFO) << LOCATION << " rocksdb_get_finalize (finished)";
delete (std::string *)hint;
}
LOG_IF(logging_enabled, INFO) << LOCATION << " rocksdb_get_finalize (finished)";
}
/**
* Gets key / value pair from a database.
*/
NAPI_METHOD(rocksdb_get)
{
LOG_IF(logging_enabled, INFO) << LOCATION << " rocksdb_get (started)";
NAPI_ARGV(3);
LOG_IF(logging_enabled, INFO) << LOCATION << " rocksdb_get (getting db argument)";
rocksdb::DB *DECLARE_FROM_EXTERNAL_ARGUMENT(0, db);
LOG_IF(logging_enabled, INFO) << LOCATION << " rocksdb_get (getting k argument)";
DECLARE_SLICE_FROM_BUFFER_ARGUMENT(1, k);
LOG_IF(logging_enabled, INFO) << LOCATION << " rocksdb_get (getting options argument)";
rocksdb::ReadOptions *DECLARE_FROM_EXTERNAL_ARGUMENT(2, options);
LOG_IF(logging_enabled, INFO) << LOCATION << " rocksdb_get (declaring v variable)";
std::string *v = new std::string();
LOG_IF(logging_enabled, INFO) << LOCATION << " rocksdb_get (getting value from database)";
ROCKSDB_STATUS_THROWS(db->Get(*options, k, v));
LOG_IF(logging_enabled, INFO) << LOCATION << " rocksdb_get (wrapping value with js wrapper)";
napi_value result;
NAPI_STATUS_THROWS(napi_create_external_buffer(env, v->size(), (void *)v->c_str(), rocksdb_get_finalize, v, &result));
LOG_IF(logging_enabled, INFO) << LOCATION << " rocksdb_get (finished)";
return result;
}
The code that launches the above method is implemented in TypeScript and runs in Node.js; here is the complete listing:
import path from 'path';
import { bindings as rocks, Unique, BatchContext } from 'rocksdb';
import { MapOf } from '../types';
import { Command, CommandOptions, CommandOptionDeclaration, Persist, CommandEnvironment } from '../command';
// tslint:disable-next-line: no-empty-interface
export interface PullCommandOptions {
}
@Command
export class ExampleCommandNameCommand implements Command {
public get description(): string {
return "[An example command description]";
}
public get options(): CommandOptions<CommandOptionDeclaration> {
const result: MapOf<PullCommandOptions, CommandOptionDeclaration> = new Map();
return result;
}
public async run(environment: CommandEnvironment, opts: CommandOptions<unknown>): Promise<void> {
// let options = opts as unknown as PullCommandOptions;
let window = global as any;
window.rocks = rocks;
const configPath = path.resolve('log.conf');
const configPathBuffer = Buffer.from(configPath);
rocks.logger_config(configPathBuffer);
rocks.logger_start();
let db = window.db = rocks.rocksdb_open(Buffer.from('test.db', 'utf-8'), rocks.rocksdb_options_init());
let readOptions = window.readOptions = rocks.rocksdb_read_options_init();
let writeOptions = window.writeOptions = rocks.rocksdb_write_options_init();
// ===== The line below launches the C++ method
rocks.rocksdb_put(db, Buffer.from('Zookie'), Buffer.from('Cookie'), writeOptions);
// ===== The line above launches the C++ method
console.log(rocks.rocksdb_get(db, Buffer.from('Zookie'), readOptions).toString());
let batch: Unique<BatchContext> | null = rocks.rocksdb_batch_init();
rocks.rocksdb_batch_put(batch, Buffer.from('Cookie'), Buffer.from('Zookie'));
rocks.rocksdb_batch_put(batch, Buffer.from('Pookie'), Buffer.from('Zookie'));
rocks.rocksdb_batch_put(batch, Buffer.from('Zookie'), Buffer.from('Zookie'));
rocks.rocksdb_batch_put(batch, Buffer.from('Hookie'), Buffer.from('Zookie'));
await rocks.rocksdb_batch_write_async(db, batch, writeOptions);
batch = null;
let proceed = true;
while (proceed) {
await new Promise(resolve => setTimeout(resolve, 1000));
}
}
}
Basically, this code represents an implementation of a KeyValueDatabase->Get("Some key") method: you pass a string into it and you get a string in return. The issue clearly revolves around the new std::string() call, so I was hoping for an explanation of why it's bad to go this way. How is it possible to move a string value from one string into another without a copy?
But when I introduce an optimization that reduces one extra memcpy
It's unclear which extra memcpy you think you are optimizing out.
If the string is short, and you are using std::string with short-string optimization, then indeed you will optimize out a short memcpy. However, dynamically allocating and then deleting std::string is likely much more expensive than the memcpy.
If the string is long, you don't actually optimize anything at all, and instead make the code slower for no reason.
I get segfaults:
The fact that adding v = new std::string; ... ; delete v; introduces a SIGSEGV is a likely indication that you have some other heap corruption going on, which remains unnoticed until you shift things a bit. Valgrind is your friend.
TL;DR: always remember that std::vector needs to move your data around when it grows, which invalidates any pointers you still have floating around.
I've googled around for this problem a bit, and it seems every case I came across was a question of calling delete on the same pointer twice. I'm writing a small program and I'm getting heap corruption, but the only thing doing heap allocation is the C++ standard library. I have a hunch I'm leaking a reference to a local variable or have done something wrong with polymorphism, but I can't figure it out.
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;
struct Project;
struct Solution;
struct Line {
string command;
vector<string> params;
void print(ostream &os) {
os << command << ": ";
for (string s : params)
os << s << ' ';
os << endl;
}
};
struct Properties {
vector<string> includes;
vector<string> libPaths;
vector<string> libs;
vector<string> sources;
vector<string> headers;
vector<Project *> depends;
string folder;
string name;
string type;
};
struct Project : Properties {
Project() { built = false; }
bool built;
void build() {
if (built)
return;
built = true;
for (Project *p : depends)
p->build();
cout << "Building project: " << name << endl;
}
};
struct Solution : Properties {
public:
Project *getProject(const string &name) {
for (Project &p : projects) {
if (p.name == name)
return &p;
}
// No project with such a name -- create it
Project p;
cout << &p << endl;
p.name = name;
projects.push_back(p);
cout << "Created project: " << name << endl;
return getProject(name);
}
private:
vector<Project> projects;
};
Line parseLine(const string &strline) {
istringstream stream(strline);
Line line;
stream >> line.command;
while (stream.good()) {
string tok;
stream >> tok;
if (tok.length() > 0)
line.params.push_back(tok);
}
return line;
}
template <typename T>
vector<T> concat(const vector<T> &a, const vector<T> &b) {
vector<T> vec;
for (T obj : a)
vec.push_back(obj);
for (T obj : b)
vec.push_back(obj);
return vec;
}
template <typename T>
void printVector(ostream os, vector<T> v) {
for (T obj : v)
os << obj;
os << endl;
}
int main(int argc, char *argv[]) {
Solution solution;
Properties *properties = &solution;
ifstream stream("testproj.txt");
Project p[100]; // No error here....
string linestr;
for (int lineNum = 1; getline(stream, linestr); lineNum++) {
Line line = parseLine(linestr);
if (line.command == "solution") {
// Make future commands affect the solution
properties = &solution;
} else if (line.command == "exe" || line.command == "lib") {
if (line.params.size() != 1) {
cerr << "Error at line " << lineNum << endl;
return 1;
}
// Make future commands affect this project
properties = solution.getProject(line.params[0]);
properties->type = line.command;
properties->name = line.params[0];
} else if (line.command == "includes") {
properties->includes = concat(properties->includes, line.params);
} else if (line.command == "libpath") {
properties->libPaths = concat(properties->libPaths, line.params);
} else if (line.command == "libs") {
properties->libs = concat(properties->libs, line.params);
} else if (line.command == "folder") {
if (line.params.size() != 1) {
cerr << "Error at line " << lineNum << endl;
return 1;
}
properties->folder = line.params[0];
} else if (line.command == "source") {
properties->sources = concat(properties->sources, line.params);
} else if (line.command == "header") {
properties->headers = concat(properties->headers, line.params);
} else if (line.command == "depends") {
Project *proj;
for (string projName : line.params) {
proj = solution.getProject(projName);
properties->depends.push_back(proj);
}
}
}
}
The error:
HEAP: Free Heap block 00395B68 modified at 00395BAC after it was freed
Here is my stack trace (sorry no line numbers in the source above):
crashes in malloc & ntdll somewhere up here
libstdc++ ---- incomprehensible name mangling
main.cpp, line 24 (inside Properties::Properties()): (compiler-generated constructor)
main.cpp, line 37 (inside Project::Project()): Project() { built = false; }
main.cpp, line 62 (inside Solution::getProject()): Project p;
main.cpp, line 150 (inside main()): proj = solution.getProject(projName);
It seems to be crashing in the default constructor for Properties? Perhaps while constructing a vector?
Edit:
The input file, if it would help:
solution
includes deps/include deps/include/SDL2
libpath deps/lib
libs opengl32 glu32 SDL2main SDL2 libpng16 glew
exe game
folder game
source main.cpp
depends render common
lib render
folder render
source Shader.cpp
header TODO
depends common
lib common
folder common
source util.cpp
header TODO
This is a lot of code, but one strong possibility is that you are de-referencing one of the pointers returned by getProject, but this has been invalidated because the vector projects, that holds the objects pointed to, has performed a re-allocation. This invalidates all pointers, references and iterators.
When you do this:
projects.push_back(p);
projects may need to grow, which results in a re-allocation and the invalidation of pointers mentioned above.
Without looking into the code in any depth, it looks like you can implement Solution quite trivially by using an std::map:
struct Solution : Properties
{
public:
// Check for project with name "name"
// Add one if it doesn't exist
// return it
Project& getProject(const std::string& name)
{
if (!projects.count(name))
{
projects[name].name = name;
}
return projects[name];
}
// Return project with name "name" or raise exception
// if it doesn't exist
const Project& getProject(const std::string& name) const
{
return projects.at(name);
}
private:
std::map<std::string, Project> projects;
};
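The reason the std::map version is safe can be shown in isolation: a map never relocates its elements when others are inserted, whereas a vector relocates all of them once its capacity is exceeded. Item, addToMap and wouldReallocate below are hypothetical names used only for this sketch.

```cpp
#include <map>
#include <string>
#include <vector>

struct Item {
    std::string name;
};

// std::map never relocates its nodes: a pointer to an element stays valid
// while other elements are inserted (until that element itself is erased).
Item* addToMap(std::map<std::string, Item>& items, const std::string& name) {
    Item& item = items[name];
    item.name = name;
    return &item;
}

// With std::vector the next push_back reallocates, and so invalidates every
// outstanding pointer, whenever the vector is already full.
bool wouldReallocate(const std::vector<Item>& items) {
    return items.size() == items.capacity();
}
```

If a vector has to stay, calling reserve with an upper bound before handing out pointers, or returning indices instead of pointers, avoids the dangling pointers as well.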
I am trying to write a logger class for my C++ calculator, but I'm experiencing a problem while trying to push a string into a list.
I have tried researching this issue and have found some information on it, but nothing that seems to help with my problem. I am using a rather basic C++ compiler with few debugging utilities, and I've not used C++ in quite some time (even then it was only a small amount).
My code:
#ifndef _LOGGER_H_
#define _LOGGER_H_
#include <iostream>
#include <list>
#include <string>
using std::cout;
using std::cin;
using std::endl;
using std::list;
using std::string;
class Logger
{
private:
list<string> mEntries;
public:
Logger() {}
~Logger() {}
// Public Methods
void WriteEntry(const string& entry)
{
mEntries.push_back(entry);
}
void DisplayEntries()
{
cout << endl << "**********************" << endl
<< "* Logger Entries *" << endl
<< "**********************" << endl
<< endl;
for(list<string>::iterator it = mEntries.begin();
it != mEntries.end(); it++)
{
// *** BELOW LINE IS MARKED WITH THE ERROR ***
cout << *it << endl;
}
}
};
#endif
I am calling the WriteEntry method by simply passing in a string, like so:
mLogger->WriteEntry("Testing");
Any advice on this would be greatly appreciated.
* CODE ABOVE HAS BEEN ALTERED TO HOW IT IS NOW *
Now, the line:
cout << *it << endl;
causes the same error. I'm assuming this has something to do with how I am trying to get the string value from the iterator.
The code I am using to call it is in my main.cpp file:
#include <iostream>
#include <string>
#include <sstream>
#include "CommandParser.h"
#include "CommandManager.h"
#include "Exceptions.h"
#include "Logger.h"
using std::string;
using std::stringstream;
using std::cout;
using std::cin;
using std::endl;
#define MSG_QUIT 2384321
#define SHOW_LOGGER true
void RegisterCommands(void);
void UnregisterCommands(void);
int ApplicationLoop(void);
void CheckForLoggingOutput(void);
void ShowDebugLog(void);
// Operations
double Operation_Add(double* params);
double Operation_Subtract(double* params);
double Operation_Multiply(double* params);
double Operation_Divide(double* params);
// Variable
CommandManager *mCommandManager;
CommandParser *mCommandParser;
Logger *mLogger;
int main(int argc, const char **argv)
{
mLogger->WriteEntry("Registering commands...\0");
// Make sure we register all commands first
RegisterCommands();
mLogger->WriteEntry("Command registration complete.\0");
// Check the input to see if we're using the program standalone,
// or not
if(argc == 0)
{
mLogger->WriteEntry("Starting application message pump...\0");
// Full version
int result;
do
{
result = ApplicationLoop();
} while(result != MSG_QUIT);
}
else
{
mLogger->WriteEntry("Starting standalone application...\0");
// Standalone - single use
// Join the args into a string
stringstream joinedStrings(argv[0]);
for(int i = 1; i < argc; i++)
{
joinedStrings << argv[i];
}
mLogger->WriteEntry("Parsing argument '" + joinedStrings.str() + "'...\0");
// Parse the string
mCommandParser->Parse(joinedStrings.str());
// Get the command names from the parser
list<string> commandNames = mCommandParser->GetCommandNames();
// Check that all of the commands have been registered
for(list<string>::iterator it = commandNames.begin();
it != commandNames.end(); it++)
{
mLogger->WriteEntry("Checking command '" + *it + "' is registered...\0");
if(!mCommandManager->IsCommandRegistered(*it))
{
// TODO: Throw exception
mLogger->WriteEntry("Command '" + *it + "' has not been registered.\0");
}
}
// Get each command from the parser and use its values
// to invoke the relevant command from the manager
double results[commandNames.size()];
int currentResultIndex = 0;
for(list<string>::iterator name_iterator = commandNames.begin();
name_iterator != commandNames.end(); name_iterator++)
{
string paramString = mCommandParser->GetCommandValue(*name_iterator);
list<string> paramStringArray = StringHelper::Split(paramString, ' ');
double params[paramStringArray.size()];
int index = 0;
for(list<string>::iterator param_iterator = paramStringArray.begin();
param_iterator != paramStringArray.end(); param_iterator++)
{
// Parse the current string to a double value
params[index++] = atof(param_iterator->c_str());
}
mLogger->WriteEntry("Invoking command '" + *name_iterator + "'...\0");
results[currentResultIndex++] =
mCommandManager->InvokeCommand(*name_iterator, params);
}
// Output all results
for(int i = 0; i < commandNames.size(); i++)
{
cout << "Result[" << i << "]: " << results[i] << endl;
}
}
mLogger->WriteEntry("Unregistering commands...\0");
// Make sure we clear up our resources
UnregisterCommands();
mLogger->WriteEntry("Command unregistration complete.\0");
if(SHOW_LOGGER)
{
CheckForLoggingOutput();
}
system("PAUSE");
return 0;
}
void RegisterCommands()
{
mCommandManager = new CommandManager();
mCommandParser = new CommandParser();
mLogger = new Logger();
// Known commands
mCommandManager->RegisterCommand("add", &Operation_Add);
mCommandManager->RegisterCommand("sub", &Operation_Subtract);
mCommandManager->RegisterCommand("mul", &Operation_Multiply);
mCommandManager->RegisterCommand("div", &Operation_Divide);
}
void UnregisterCommands()
{
// Unregister each command
mCommandManager->UnregisterCommand("add");
mCommandManager->UnregisterCommand("sub");
mCommandManager->UnregisterCommand("mul");
mCommandManager->UnregisterCommand("div");
// Delete the logger pointer
delete mLogger;
// Delete the command manager pointer
delete mCommandManager;
// Delete the command parser pointer
delete mCommandParser;
}
int ApplicationLoop()
{
return MSG_QUIT;
}
void CheckForLoggingOutput()
{
char answer = 'n';
cout << endl << "Do you wish to view the debug log? [y/n]: ";
cin >> answer;
switch(answer)
{
case 'y':
ShowDebugLog();
break;
}
}
void ShowDebugLog()
{
mLogger->DisplayEntries();
}
// Operation Definitions
double Operation_Add(double* values)
{
double accumulator = 0.0;
// Iterate over all values and accumulate them
for(int i = 0; i < (sizeof values) - 1; i++)
{
accumulator += values[i];
}
// Return the result of the calculation
return accumulator;
}
double Operation_Subtract(double* values)
{
double accumulator = 0.0;
// Iterate over all values and negatively accumulate them
for(int i = 0; i < (sizeof values) - 1; i++)
{
accumulator -= values[i];
}
// Return the result of the calculation
return accumulator;
}
double Operation_Multiply(double* values)
{
double accumulator = 0.0;
for(int i = 0; i < (sizeof values) - 1; i++)
{
accumulator *= values[i];
}
// Return the value of the calculation
return accumulator;
}
double Operation_Divide(double* values)
{
double accumulator = 0.0;
for(int i = 0; i < (sizeof values) - 1; i++)
{
accumulator /= values[i];
}
// Return the result of the calculation
return accumulator;
}
Did you remember to call mLogger = new Logger at some point? Did you accidentally delete mLogger before writing to it?
Try running your program in valgrind to see whether it finds any memory errors.
After your edit, the solution seems clear:
The first line in main() is:
mLogger->WriteEntry("Registering commands...\0");
Here mLogger is a pointer that has never been initialized. Dereferencing it is "undefined behaviour", meaning anything can happen, often bad things.
To fix this you can either make it a "normal" variable rather than a pointer, or create a Logger instance using new (either at the declaration or as the first line in main).
I suggest not using a pointer, so that the logger always exists and is destroyed automatically.
By the way, it seems you want to create every object on the heap using pointers. That is not recommended unless it is necessary. Use pointers ONLY if you want to state explicitly when the object is created (using new) and destroyed (using delete). If you just need it in a specific scope, don't use a pointer. You might come from another language like Java or C#, where all objects are references. If so, you should learn C++ as a different language to avoid this kind of problem, and learn about RAII and other C++-specific paradigms that those languages cannot teach you. If you come from C, you should likewise treat C++ as a different language. That might help you avoid complex problems like the one you showed here. May I suggest you read some of the questions on Stack Overflow about C++ pointers, references and RAII.
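As a minimal sketch of the RAII point above (ScopedLog, runScope and the live-instance counter are hypothetical names used only for illustration, not part of the question's code), an object declared as a plain local variable is constructed where it is declared and destroyed automatically when its scope ends, with no new or delete:

```cpp
#include <cstddef>
#include <list>
#include <string>

// Hypothetical stand-in for the question's Logger, held by value.
class ScopedLog
{
public:
    explicit ScopedLog(int& liveCount) : mLiveCount(liveCount) { ++mLiveCount; }
    ~ScopedLog() { --mLiveCount; }   // runs automatically at end of scope

    void WriteEntry(const std::string& entry) { mEntries.push_back(entry); }
    std::size_t EntryCount() const { return mEntries.size(); }

private:
    int& mLiveCount;                 // counts live instances, for illustration only
    std::list<std::string> mEntries; // plain member, no pointer
};

// Returns the number of entries written while the logger was alive.
std::size_t runScope(int& liveCount)
{
    ScopedLog log(liveCount);        // constructed here, valid from this point on
    log.WriteEntry("Registering commands...");
    log.WriteEntry("Command registration complete.");
    return log.EntryCount();
}                                    // log is destroyed here, no delete needed
```

Calling runScope with a counter that starts at zero should leave the counter at zero again afterwards, because the destructor has already run by the time the function returns.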
First, you don't need to create the std::list on the heap. You should just use it as a normal member of the class.
class Logger
{
private:
list<string> mEntries; // no need to use a pointer
public:
Logger() // initialization is automatic, no need to do anything
{
}
~Logger() // clearing and destruction is automatic too, no need to do anything
{
}
//...
};
Next, entryData doesn't exist in this code, so I guess you meant to use entry. If it's not a typo, then you haven't provided the definition of entryData, and that is most likely the source of your problem.
In fact I would have written your class this way instead:
class Logger
{
private:
list<string> mEntries;
public:
// no need for constructor and destructor, use the default ones
// Public Methods
void WriteEntry(const string& entry) // take a const reference to avoid an unnecessary copy of the argument
{
mEntries.push_back( entry ); // here the list will create a node with a string inside, so this is exactly like calling the copy constructor
}
void DisplayEntries()
{
cout << endl << "**********************" << endl
<< "* Logger Entries *" << endl
<< "**********************" << endl
<< endl;
for(list<string>::iterator it = mEntries.begin();
it != mEntries.end(); ++it) // if you want to avoid unnecessary copies, use ++it instead of it++
{
cout << *it << endl;
}
}
};
What's certain is that your segfault is from usage outside of this class.
Is an instance of Logger being copied anywhere (either through a copy constructor or operator=)? Since you have mEntries as a pointer to a list, if you copy an instance of Logger, they will share the value of the pointer, and when one is destructed, it deletes the list. The original then has a dangling pointer. A quick check is to make the copy constructor and operator= private and not implemented:
private:
void operator=(const Logger &); // not implemented
Logger(const Logger &); // not implemented
When you recompile, the compiler will flag any copies of any Logger instances.
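If a C++11 compiler is available (an assumption, since the question does not say which compiler is in use), the same check can be sketched with deleted functions instead of private declarations. NonCopyableLogger and EntryCount are hypothetical names; the owning list<string> pointer mirrors what the question's Logger appears to hold:

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <type_traits>

class NonCopyableLogger
{
public:
    NonCopyableLogger() : mEntries(new std::list<std::string>()) {}
    ~NonCopyableLogger() { delete mEntries; }

    // Any attempt to copy or assign now fails at compile time.
    NonCopyableLogger(const NonCopyableLogger&) = delete;
    NonCopyableLogger& operator=(const NonCopyableLogger&) = delete;

    void WriteEntry(const std::string& entry) { mEntries->push_back(entry); }
    std::size_t EntryCount() const { return mEntries->size(); }

private:
    std::list<std::string>* mEntries; // owning pointer, as in the question
};

// Compile-time confirmation that the class can no longer be copied.
static_assert(!std::is_copy_constructible<NonCopyableLogger>::value,
              "NonCopyableLogger must not be copy constructible");
static_assert(!std::is_copy_assignable<NonCopyableLogger>::value,
              "NonCopyableLogger must not be copy assignable");
```

With this in place, a line such as `NonCopyableLogger b = a;` is rejected by the compiler, which points directly at the place where a copy was being made.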
If you need to copy instances of Logger, the fix is to follow the Rule of 3:
http://en.wikipedia.org/wiki/Rule_of_three_%28C%2B%2B_programming%29
You can do this by eliminating the need for the destructor (by not using a pointer: list<string> mEntries), or by adding the needed code to the copy constructor and operator= to make a deep copy of the list.
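A minimal sketch of the first option, assuming the list is held by value (ValueLogger, EntryCount and copyThenWrite are hypothetical names). The compiler-generated copy constructor, operator= and destructor then copy and destroy the list correctly, so each Logger owns its own entries:

```cpp
#include <cstddef>
#include <list>
#include <string>

class ValueLogger
{
public:
    void WriteEntry(const std::string& entry) { mEntries.push_back(entry); }
    std::size_t EntryCount() const { return mEntries.size(); }

private:
    std::list<std::string> mEntries; // by value: no destructor or copy code needed
};

// Copies a logger, lets the source go out of scope, then keeps using the copy.
std::size_t copyThenWrite()
{
    ValueLogger a;
    {
        ValueLogger b;
        b.WriteEntry("from b");
        a = b;               // the list itself is copied, not a pointer to it
    }                        // b is destroyed here; a is unaffected
    a.WriteEntry("Testing"); // safe: a has its own list
    return a.EntryCount();
}
```

Here copyThenWrite returns 2: one entry copied from b plus the one written afterwards.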
You only need to do
list<string> entries;
entries.push_back(entry);
You do not need to create a pointer to entries.
Nothing too obvious, though you typed
mEntries->push_back(string(entryData));
and I think you meant entry instead of entryData. You also don't need the string conversion on that line, and your function should take entry by const reference.
However, none of these things would cause your program to segfault. What compiler are you using?
You're missing the copy constructor (and the same applies to operator=). If a Logger object is copied and the original is then destroyed, the copy will dereference memory that has already been deleted.
A simplified example of the problem
Logger a;
{
Logger b;
a=b;
}
a.WriteEntry("Testing");
Add a copy constructor.
Logger(const Logger& item)
{
mEntries = new list<string>();
std::copy(item.mEntries->begin(), item.mEntries->end(), std::back_inserter(*mEntries));
}
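Note that the simplified example uses a=b, which calls operator= rather than the copy constructor, so the assignment operator needs the same treatment. A sketch of all three pieces together, assuming the question's Logger owns a list<string> pointer named mEntries (PtrLogger, EntryCount and assignThenWrite are hypothetical names used for illustration):

```cpp
#include <cstddef>
#include <list>
#include <string>

class PtrLogger
{
public:
    PtrLogger() : mEntries(new std::list<std::string>()) {}

    // Copy constructor: allocate a new list holding a copy of the entries.
    PtrLogger(const PtrLogger& other)
        : mEntries(new std::list<std::string>(*other.mEntries)) {}

    // Assignment operator: same deep copy, guarding against self-assignment.
    PtrLogger& operator=(const PtrLogger& other)
    {
        if (this != &other)
        {
            // Build the copy first so *this is unchanged if allocation throws.
            std::list<std::string>* copy =
                new std::list<std::string>(*other.mEntries);
            delete mEntries;
            mEntries = copy;
        }
        return *this;
    }

    ~PtrLogger() { delete mEntries; }

    void WriteEntry(const std::string& entry) { mEntries->push_back(entry); }
    std::size_t EntryCount() const { return mEntries->size(); }

private:
    std::list<std::string>* mEntries; // owning pointer
};

// Mirrors the simplified example: assign from b, destroy b, keep using a.
std::size_t assignThenWrite()
{
    PtrLogger a;
    {
        PtrLogger b;
        b.WriteEntry("from b");
        a = b;               // deep copy through operator=
    }                        // b deletes only its own list
    a.WriteEntry("Testing"); // a's list is still valid
    return a.EntryCount();
}
```

With both the copy constructor and operator= making deep copies, assignThenWrite returns 2 and no list is deleted twice.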