BK-Tree implementation: how to reduce insertion time? - C++

Below is my attempt at writing a BK-tree. For a 150,000-word file, building the tree takes around 8 seconds.
Is there any way to reduce this time?
Here is my code:
#include <algorithm> // std::min, std::swap
#include <string>
#include <vector>
#include <fstream>
#include <iostream>
#include <sstream>
#include "Timer.h"
class BkTree {
public:
BkTree();
~BkTree();
void insert(std::string m_item);
private:
size_t EditDistance( const std::string &s, const std::string &t );
struct Node {
std::string m_item;
size_t m_distToParent;
Node *m_firstChild;
Node *m_nextSibling;
Node(std::string x, size_t dist);
~Node();
};
Node *m_root;
int m_size;
protected:
};
BkTree::BkTree() {
m_root = NULL;
m_size = 0;
}
BkTree::~BkTree() {
if( m_root )
delete m_root;
}
BkTree::Node::Node(std::string x, size_t dist) {
m_item = x;
m_distToParent = dist;
m_firstChild = m_nextSibling = NULL;
}
BkTree::Node::~Node() {
if( m_firstChild )
delete m_firstChild;
if( m_nextSibling )
delete m_nextSibling;
}
void BkTree::insert(std::string m_item) {
if( !m_root ){
m_size = 1;
m_root = new Node(m_item, -1);
return;
}
Node *t = m_root;
while( true ) {
size_t d = EditDistance( t->m_item, m_item );
if( !d )
return;
Node *ch = t->m_firstChild;
while( ch ) {
if( ch->m_distToParent == d ) {
t = ch;
break;
}
ch = ch->m_nextSibling;
}
if( !ch ) {
Node *newChild = new Node(m_item, d);
newChild->m_nextSibling = t->m_firstChild;
t->m_firstChild = newChild;
m_size++;
break;
}
}
}
size_t BkTree::EditDistance( const std::string &left, const std::string &right ) {
size_t asize = left.size();
size_t bsize = right.size();
std::vector<size_t> prevrow(bsize+1);
std::vector<size_t> thisrow(bsize+1);
for(size_t i = 0; i <= bsize; i++)
prevrow[i] = i;
for(size_t i = 1; i <= asize; i ++) {
thisrow[0] = i;
for(size_t j = 1; j <= bsize; j++) {
thisrow[j] = std::min(prevrow[j-1] + size_t(left[i-1] != right[j-1]),
1 + std::min(prevrow[j],thisrow[j-1]) );
}
std::swap(thisrow,prevrow);
}
return prevrow[bsize];
}
void trim(std::string& input_str) {
if(input_str.empty()) return;
size_t startIndex = input_str.find_first_not_of(" ");
size_t endIndex = input_str.find_last_not_of("\r\n");
std::string temp_str = input_str;
input_str.erase();
input_str = temp_str.substr(startIndex, (endIndex-startIndex+ 1) );
}
int main( int argc, char **argv ) {
BkTree *pDictionary = new BkTree();
std::ifstream dictFile("D:\\dictionary.txt");
Timer *t = new Timer("Time Taken to prepare Tree = ");
std::string line;
if (dictFile.is_open()) {
// Test the read itself: looping on eof() processes a stale line after the last read fails
while (std::getline(dictFile, line)) {
trim(line);
pDictionary->insert(line);
}
dictFile.close();
}
delete t;
delete pDictionary;
return 0;
}
// Timer.h (included above)
#include <ctime> // std::clock_t, std::clock, CLOCKS_PER_SEC
class Timer {
public:
Timer (const std::string &name = "undef");
~Timer (void);
private:
std::string m_name;
std::clock_t m_started;
protected:
};
Timer::Timer (const std::string &name) : m_name(name), m_started(clock()) {
}
Timer::~Timer (void) {
double secs = static_cast<double>(std::clock() - m_started) / CLOCKS_PER_SEC;
std::cout << m_name << ": " << secs << " secs." << std::endl;
}

You can reduce the measured time by eliminating the I/O. To test your algorithm, take everything that is not directly under your program's control out of the equation. The OS controls the I/O, for example, so it is out of your control. Reading from an array of constant text removes most of the OS involvement (the OS may still page the array, depending on how it allocates memory).
Next, most tree structures are data-dependent: their performance depends on the input. Try three data sets: sorted ascending, "random", and sorted descending. Note the times for each.
Look at your loops and factor out any constants. Hoist calculations that do not change inside an inner loop into temporary variables outside it. Remove unnecessary operations.
Lastly, if your program and algorithm are already robust, work on other projects. Optimize only if necessary.
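As one possible illustration of "remove unnecessary operations": EditDistance in the question allocates two vectors on every call, and it is called once per tree level for every inserted word. The following is a minimal sketch, assuming a hypothetical helper class EditDistanceCalc (not part of the original code) that keeps the two row buffers alive between calls. Whether allocation is really the dominant cost should be confirmed with a profiler.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the question): the same two-row Levenshtein
// algorithm, but the row buffers are members, so repeated calls reuse their
// capacity instead of allocating two vectors each time.
class EditDistanceCalc {
public:
    std::size_t operator()(const std::string& left, const std::string& right) {
        const std::size_t asize = left.size();
        const std::size_t bsize = right.size();
        // resize() only reallocates when a buffer has to grow.
        prevrow_.resize(bsize + 1);
        thisrow_.resize(bsize + 1);
        for (std::size_t j = 0; j <= bsize; ++j)
            prevrow_[j] = j;
        for (std::size_t i = 1; i <= asize; ++i) {
            thisrow_[0] = i;
            const char c = left[i - 1];  // constant for the whole inner loop
            for (std::size_t j = 1; j <= bsize; ++j) {
                const std::size_t substitute =
                    prevrow_[j - 1] + (c != right[j - 1] ? 1 : 0);
                const std::size_t insertOrDelete =
                    1 + std::min(prevrow_[j], thisrow_[j - 1]);
                thisrow_[j] = std::min(substitute, insertOrDelete);
            }
            std::swap(thisrow_, prevrow_);
        }
        return prevrow_[bsize];
    }

private:
    std::vector<std::size_t> prevrow_;
    std::vector<std::size_t> thisrow_;
};
```

In the BkTree class this could be held as a member and called in place of EditDistance. Note that the object is then not safe to share between threads.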

Related

Improving usage of C++ vector access

I am currently writing an importer for Labplot to support BLF files. The importer works fine, but looking at its performance shows a lot of room for improvement: adding the data to the data container consumes most of the computation time. As a test I already tried std::vector, but it did not have a big impact.
I cannot use a static array because the actual number of messages is unknown (only an upper limit is known): if the dbcParser is not able to parse a message, that message is skipped. So at the end of the import I have to resize the array.
Are there any recommendations on how to improve the performance of the code?
Definition of v: QVector<const Vector::BLF::ObjectHeaderBase*> v;
bool firstMessageValid = false;
for (const auto ohb : v) {
int id;
std::vector<double> values;
if (ohb->objectType == Vector::BLF::ObjectType::CAN_MESSAGE) {
const auto message = reinterpret_cast<const Vector::BLF::CanMessage*>(ohb);
id = message->id;
m_dbcParser.parseMessage(message->id, message->data, values);
} else if (ohb->objectType == Vector::BLF::ObjectType::CAN_MESSAGE2) {
const auto message = reinterpret_cast<const Vector::BLF::CanMessage2*>(ohb);
id = message->id;
m_dbcParser.parseMessage(message->id, message->data, values);
} else
return 0;
if (values.size() == 0) {
// id is not available in the dbc file, so it is not possible to decode
DEBUG("Unable to decode message: " << id);
continue;
}
uint64_t timestamp;
timeInNS = getTime(ohb, timestamp);
if (convertTimeToSeconds) {
double timestamp_seconds;
if (timeInNS)
timestamp_seconds = (double)timestamp / pow(10, 9); // TimeOneNans
else
timestamp_seconds = (double)timestamp / pow(10, 5); // TimeTenMics
m_DataContainer.setData<double>(0, message_index, timestamp_seconds);
} else
m_DataContainer.setData<qint64>(0, message_index, timestamp);
if (firstMessageValid) {
const auto startIndex = idIndexTable.value(id) + 1; // +1 because of time
for (std::vector<double>::size_type i = 1; i < startIndex; i++) {
const auto prevValue = m_DataContainer.data<double>(i, message_index - 1);
m_DataContainer.setData<double>(i, message_index, prevValue);
}
for (std::vector<double>::size_type i = startIndex; i < startIndex + values.size(); i++) {
m_DataContainer.setData<double>(i, message_index, values.at(i - startIndex));
}
for (std::vector<double>::size_type i = startIndex + values.size(); i < m_DataContainer.size(); i++) {
const auto prevValue = m_DataContainer.data<double>(i, message_index - 1);
m_DataContainer.setData<double>(i, message_index, prevValue);
}
} else {
const auto startIndex = idIndexTable.value(id) + 1; // +1 because of time
for (std::vector<double>::size_type i = 1; i < startIndex; i++) {
m_DataContainer.setData<double>(i, message_index, 0);
}
for (std::vector<double>::size_type i = startIndex; i < startIndex + values.size(); i++) {
m_DataContainer.setData<double>(i, message_index, values.at(i - startIndex));
}
for (std::vector<double>::size_type i = startIndex + values.size(); i < m_DataContainer.size(); i++) {
m_DataContainer.setData<double>(i, message_index, 0);
}
firstMessageValid = true;
}
message_index++;
}
struct DataContainer {
void clear();
template<class T>
void appendVector(QVector<T>* data, AbstractColumn::ColumnMode cm) {
m_dataContainer.push_back(data);
m_columnModes.append(cm);
};
template<class T>
void setData(int indexDataContainer, int indexData, T value) {
auto* v = static_cast<QVector<T>*>(m_dataContainer.at(indexDataContainer));
v->operator[](indexData) = value;
}
template<class T>
T data(int indexDataContainer, int indexData) {
auto* v = static_cast<QVector<T>*>(m_dataContainer.at(indexDataContainer));
return v->at(indexData);
}
int size() const;
const QVector<AbstractColumn::ColumnMode> columnModes() const;
/*!
* \brief dataContainer
* Do not modify outside as long as DataContainer exists!
* \return
*/
std::vector<void*> dataContainer() const;
AbstractColumn::ColumnMode columnMode(int index) const;
const void* datas(int index) const;
bool resize(uint32_t) const;
private:
QVector<AbstractColumn::ColumnMode> m_columnModes;
std::vector<void*> m_dataContainer; // pointers to the actual data containers
};
Edit
Before the loop I resize every vector to the absolute maximum number of messages.
if (convertTimeToSeconds) {
auto* vector = new QVector<double>();
vector->resize(message_counter);
m_DataContainer.appendVector<double>(vector, AbstractColumn::ColumnMode::Double);
} else {
auto* vector = new QVector<qint64>();
vector->resize(message_counter);
m_DataContainer.appendVector<qint64>(vector, AbstractColumn::ColumnMode::BigInt); // BigInt is qint64 and not quint64!
}
for (int i = 0; i < vectorNames.length(); i++) {
auto* vector = new QVector<double>();
vector->resize(message_counter);
m_DataContainer.appendVector(vector, AbstractColumn::ColumnMode::Double);
}
During parsing I discard messages that I am not able to parse, therefore message_index <= message_counter. So if there are 100k messages but I parse only 50k of them, I have to resize the array at the end to avoid wasting memory.
m_DataContainer.resize(message_index);
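The resize-to-the-maximum-then-shrink pattern described above can be sketched with a plain std::vector. This is a minimal sketch under stated assumptions, not the Labplot code: decodeAll is a hypothetical name, and negative inputs stand in for messages that cannot be decoded. It also fetches the data pointer once before the loop instead of looking the column up for every element.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the import loop: values < 0 play the role of
// messages that cannot be decoded and are therefore skipped.
std::vector<double> decodeAll(const std::vector<double>& raw) {
    std::vector<double> column;
    column.resize(raw.size());      // upper bound, allocated once
    double* out = column.data();    // fetched once, not once per element
    std::size_t messageIndex = 0;
    for (double v : raw) {
        if (v < 0)
            continue;               // skipped message
        out[messageIndex++] = v;
    }
    column.resize(messageIndex);    // drop the unused tail
    column.shrink_to_fit();         // non-binding request to release capacity
    return column;
}
```

resize() to a smaller size does not release memory by itself; shrink_to_fit() asks for that, but the standard library is allowed to ignore the request.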
Edit2
Replacing
auto* v = static_cast<QVector<T>*>(m_dataContainer.at(indexDataContainer));
v->operator[](indexData) = value;
by
static_cast<QVector<T>*>(m_dataContainer.at(indexDataContainer))->operator[](indexData) = value;
and replacing
auto* v = static_cast<QVector<T>*>(m_dataContainer.at(indexDataContainer));
return v->at(indexData);
by
return static_cast<QVector<T>*>(m_dataContainer.at(indexDataContainer))->at(indexData);
brought an improvement of about 20%. I thought this would be optimized away at -O2, but it was not.
With -O2, moving from QVector to std::vector gave a further improvement of around 25%.

Save time and date in C++

I wrote this program:
#include <iostream>
#include <string>
#include <ctime>
#include <sstream>
#include <time.h>
#include "TextTable.h"
using namespace std;
int command();
void new_car();
void print();
int c=0;
int c1=0;
char f;
int i=0;
int size1=0;
TextTable t( '-', '|', '*' );
struct car
{
string car_name;
string car_owner_name;
int year;
string car_paint;
string car_performance;
string car_problem;
int time;
};
car *a = NULL;
car *p;
int main ()
{
cout<<"welcome to car repair shop program. to help , press h."<<endl;
command();
}
int command(){
cout<<"admin#car_repair_shop_program # ";
cin>>f;
switch(f)
{
case 'n':
new_car();
break;
case 'h':
cout<<endl<<"help"<<endl<<"p : print"<<endl<<"n : new"<<endl<<"h : help"<<endl<<"q : quit"<<endl;
command();
break;
case 'p':
print();
break;
case 'q':
char tmp;
cout<<"Are you sure you want to quit? (y or n): ";
cin>>tmp;
switch(tmp){
case 'y':
delete [] a; // p aliases a after new_car(), so deleting both would free the same array twice
return 0;
break;
case 'n':
command();
break;
default:
cout << "error! Please try again"<<endl;
command();
}
break; // without this, case 'q' falls through into the outer default
default:
cout << "error! Please try again"<<endl;
command();
}
}
void new_car()
{
c++;
string car_name;
string car_owner_name;
int year;
string car_paint;
string car_performance;
string car_problem;
int time;
p = new car[++size1];
if (c==1){
a = new car [size1-1];
}
cout<<"enter car name: ";
cin>>car_name;
cout<<endl<<"enter car owner name: ";
cin>>car_owner_name;
cout<<endl<<"enter car paint: ";
cin>>car_paint;
cout<<endl<<"enter car performance: ";
cin>>car_performance;
cout<<endl<<"enter car problem: ";
cin>>car_problem;
cout<<endl<<"enter time: ";
cin>>time;
cout<<endl<<"enter year: ";
cin>>year;
for(int i = 0 ; i < size1-1 ; ++i)
{
p[i].car_name = a[i].car_name;
p[i].car_owner_name = a[i].car_owner_name;
p[i].car_paint = a[i].car_paint;
p[i].car_performance = a[i].car_performance;
p[i].car_problem = a[i].car_problem;
p[i].time = a[i].time;
p[i].year = a[i].year;
}
delete [] a;
a = p;
a[size1-1].car_name=car_name;
a[size1-1].car_owner_name=car_owner_name;
a[size1-1].car_paint=car_paint;
a[size1-1].car_performance=car_performance;
a[size1-1].car_problem=car_problem;
a[size1-1].time=time;
a[size1-1].year=year;
cout<<"OK!"<<endl;
command();
}
void print()
{
c1++;
if (c1 == 1){
t.add( " car name " );
t.add( " car owner name " );
t.add( " car paint " );
t.add( " car performance " );
t.add( " car problem " );
t.add( " time " );
t.add( " year " );
t.endOfRow();
}
string tmp;
for (;i<size1;){
t.add(p[i].car_name);
t.add(p[i].car_owner_name);
t.add(p[i].car_paint);
t.add(p[i].car_performance);
t.add(p[i].car_problem);
tmp = to_string(p[i].time);
t.add(tmp);
tmp = to_string(p[i].year);
t.add(tmp);
t.endOfRow();
t.setAlignment( i, TextTable::Alignment::LEFT );
i+=1;
}
cout << t;
command();
}
But I cannot do this part of the project:
"Check what appropriate data type provided by the C/C++ library can be used to store time and date information in the program above, and rewrite your program using these tools." This applies to the time and year variables.
The TextTable.h file contains the following:
#pragma once
#include <iostream>
#include <map>
#include <iomanip>
#include <vector>
#include <string>
#ifdef TEXTTABLE_ENCODE_MULTIBYTE_STRINGS
#include <clocale>
#ifndef TEXTTABLE_USE_EN_US_UTF8
#define TEXTTABLE_USE_EN_US_UTF8
#endif
#endif
class TextTable {
public:
enum class Alignment { LEFT, RIGHT };
typedef std::vector< std::string > Row;
TextTable() :
_horizontal( '-' ),
_vertical( '|' ),
_corner( '+' ),
_has_ruler(true)
{}
TextTable( char horizontal, char vertical, char corner ) :
_horizontal( horizontal ),
_vertical( vertical ),
_corner( corner ),
_has_ruler(true)
{}
explicit TextTable( char vertical ) :
_horizontal( '\0' ),
_vertical( vertical ),
_corner( '\0' ),
_has_ruler( false )
{}
void setAlignment( unsigned i, Alignment alignment )
{
_alignment[ i ] = alignment;
}
Alignment alignment( unsigned i ) const
{ return _alignment[ i ]; }
char vertical() const
{ return _vertical; }
char horizontal() const
{ return _horizontal; }
void add( std::string const & content )
{
_current.push_back( content );
}
void endOfRow()
{
_rows.push_back( _current );
_current.assign( 0, "" );
}
template <typename Iterator>
void addRow( Iterator begin, Iterator end )
{
for( auto i = begin; i != end; ++i ) {
add( * i );
}
endOfRow();
}
template <typename Container>
void addRow( Container const & container )
{
addRow( container.begin(), container.end() );
}
std::vector< Row > const & rows() const
{
return _rows;
}
void setup() const
{
determineWidths();
setupAlignment();
}
std::string ruler() const
{
std::string result;
result += _corner;
for( auto width = _width.begin(); width != _width.end(); ++ width ) {
result += repeat( * width, _horizontal );
result += _corner;
}
return result;
}
int width( unsigned i ) const
{ return _width[ i ]; }
bool has_ruler() const { return _has_ruler;}
int correctDistance(std::string string_to_correct) const
{
return static_cast<int>(string_to_correct.size()) - static_cast<int>(glyphLength(string_to_correct));
};
private:
const char _horizontal;
const char _vertical;
const char _corner;
const bool _has_ruler;
Row _current;
std::vector< Row > _rows;
std::vector< unsigned > mutable _width;
std::vector< unsigned > mutable _utf8width;
std::map< unsigned, Alignment > mutable _alignment;
static std::string repeat( unsigned times, char c )
{
std::string result;
for( ; times > 0; -- times )
result += c;
return result;
}
unsigned columns() const
{
return _rows[ 0 ].size();
}
unsigned glyphLength( std::string s ) const
{
unsigned int _byteLength = s.length();
#ifdef TEXTTABLE_ENCODE_MULTIBYTE_STRINGS
#ifdef TEXTTABLE_USE_EN_US_UTF8
std::setlocale(LC_ALL, "en_US.utf8");
#else
#error You need to specify the encoding if the TextTable library uses multybyte string encoding!
#endif
unsigned int u = 0;
const char *c_str = s.c_str();
unsigned _glyphLength = 0;
while(u < _byteLength)
{
u += std::mblen(&c_str[u], _byteLength - u);
_glyphLength += 1;
}
return _glyphLength;
#else
return _byteLength;
#endif
}
void determineWidths() const
{
_width.assign( columns(), 0 );
_utf8width.assign( columns(), 0 );
for ( auto rowIterator = _rows.begin(); rowIterator != _rows.end(); ++ rowIterator ) {
Row const & row = * rowIterator;
for ( unsigned i = 0; i < row.size(); ++i ) {
_width[ i ] = _width[ i ] > glyphLength(row[ i ]) ? _width[ i ] : glyphLength(row[ i ]);
}
}
}
void setupAlignment() const
{
for ( unsigned i = 0; i < columns(); ++i ) {
if ( _alignment.find( i ) == _alignment.end() ) {
_alignment[ i ] = Alignment::LEFT;
}
}
}
};
inline std::ostream & operator<<( std::ostream & stream, TextTable const & table )
{
table.setup();
if (table.has_ruler()) {
stream << table.ruler() << "\n";
}
for ( auto rowIterator = table.rows().begin(); rowIterator != table.rows().end(); ++ rowIterator ) {
TextTable::Row const & row = * rowIterator;
stream << table.vertical();
for ( unsigned i = 0; i < row.size(); ++i ) {
auto alignment = table.alignment( i ) == TextTable::Alignment::LEFT ? std::left : std::right;
// std::setw( width ) works as follows: a string which goes in the stream with byte length (!) l is filled with n spaces so that l+n=width.
// For a utf8 encoded string the glyph length g might be smaller than l. We need n spaces so that g+n=width which is equivalent to g+n+l-l=width ==> l+n = width+l-g
// l-g (that means glyph length minus byte length) has to be added to the width argument.
// l-g is computed by correctDistance.
stream << std::setw( table.width( i ) + table.correctDistance(row[ i ])) << alignment << row[ i ];
stream << table.vertical();
}
stream << "\n";
if (table.has_ruler()) {
stream << table.ruler() << "\n";
}
}
return stream;
}
What appropriate Data Type provided by the C/C++ library can be used?
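You might want to look up std::time on a site like http://www.cppreference.com. There is a lot of information there. std::time returns a std::time_t, and std::tm holds a calendar date and time broken down into fields; both are declared in <ctime>.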
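To make the pointer to <ctime> more concrete, here is a minimal sketch of how one std::tm field could replace the separate int time and int year members. CarRecord, makeDate, formatDate and currentLocalTime are hypothetical names introduced for illustration; only std::time, std::localtime, std::strftime, std::time_t and std::tm come from the standard library.

```cpp
#include <cstddef>
#include <ctime>
#include <string>

// Hypothetical record: one std::tm replaces the separate time and year ints.
struct CarRecord {
    std::string car_name;
    std::tm admitted;  // broken-down calendar date and time
};

// Builds a std::tm from calendar fields.
std::tm makeDate(int year, int month, int day, int hour, int minute) {
    std::tm t = {};
    t.tm_year = year - 1900;  // tm_year counts years since 1900
    t.tm_mon = month - 1;     // tm_mon is 0-based (0 = January)
    t.tm_mday = day;
    t.tm_hour = hour;
    t.tm_min = minute;
    t.tm_isdst = -1;          // daylight saving time status unknown
    return t;
}

// Formats the date as "YYYY-MM-DD HH:MM" with std::strftime.
std::string formatDate(const std::tm& t) {
    char buf[32];
    const std::size_t n = std::strftime(buf, sizeof buf, "%Y-%m-%d %H:%M", &t);
    return std::string(buf, n);
}

// Current local time, e.g. for stamping a newly entered car.
std::tm currentLocalTime() {
    const std::time_t raw = std::time(nullptr);
    return *std::localtime(&raw);  // not thread-safe; fine in a single-threaded program
}
```

In print(), the string returned by formatDate(record.admitted) could then be passed to t.add() in place of the two to_string calls.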

Printing from Trie

I have built a trie in C++ designed to hold words of sentences. Each sentence will have a weight which determines the order in which they should be output. I have several recursive functions that call other recursive functions, and the dilemma I am facing is that I want to print my list only once.
Basically, my get function calls the printFromNode function, which builds the vector of pairs p that I want to sort and print. If someone could point me in the right direction on how to do that, it would be much appreciated.
Code:
Trie.cpp:
//#include "Trie.h"
#include <iostream>
#include <algorithm> // std::sort
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>
#include <sstream>
#include <stack>
using namespace std;
class Node
{
private:
string word = "";
bool endOfSentence = false;
int weight = -1;
public:
vector<Node> children = {};
Node() {
this->setWord("");
}
Node(string s){
this->setWord(s);
}
string getWord(){
return this->word;
}
void setWord(string s) {
this->word = s;
}
void setEOS(){
this->endOfSentence = true;
}
void setWeight(int weight){
this->weight = weight;
}
int getWeight() {
return this->weight;
}
};
class Trie
{
public:
Node root;
void add(vector<string> phrase, int weight, Node* n){
Node* current = n;
int w = weight;
int found = -1;
for (int i = 0; i < current->children.size(); i++) {
if (phrase[0] == current->children[i].getWord()) {
found = i;
}
}
if (found > -1) {
current = &current->children[found];
phrase.erase(phrase.begin());
add(phrase, w, current);
}
else {
addPhrase(phrase, w, current);
}
}
void addPhrase(vector<string> phrase, int weight, Node* n) {
Node* current = n;
for (int i = 0; i < phrase.size(); i++) {
Node temp(phrase[i]); // construct directly; copying from *new Node(...) leaked the heap object
current->children.push_back(temp);
current = &current->children.back();
if (i == phrase.size() - 1) {
current->setEOS();
current->setWeight(weight);
}
}
}
void get(vector<string> search) {
Node* current = &this->root;
get(search, current);
}
void get(vector<string> search, Node* n) {
Node* current = n;
int found = -1;
//test search size
if (search.size() == 0) {
cout << "Please enter a valid search" << endl;
return; // do not read search[0] from an empty vector
}
for (int i = 0; i < current->children.size(); i++) {
if (search[0] == current->children[i].getWord()) {
found = i;
}
}
if (found > -1 && search.size() == 1) {
current = &current->children[found];
printFromNode(*current);
maxNode(*current);
}
else if (found > -1 && search.size() != 1) {
current = &current->children[found];
search.erase(search.begin());
get(search, current);
}
else {
cout << "Not Found" << endl;
}
}
void printOutput(vector<pair<int,string>> p){
sort(p.begin(), p.end());
cout << p.size() << endl;
for (int i = 0; i < p.size(); i++) {
cout << p[i].second << " " << endl;
}
}
void printFromNode(Node n) {
vector<string> phrase = {};
vector <pair < int, string>> final = {};
printFromNode(n,phrase,final);
}
void printFromNode(Node n, vector<string> &v, vector<pair<int,string>> &p) {
string output;
if (n.getWord() == "") {
return;
}
for (int i = 0; i < n.children.size(); i++) {
if (n.children[i].getWeight() > 0) {
for (int i = 0; i < v.size(); i++)
{
output.append(v[i] + " ");
}
output.append(n.children[i].getWord());
p.push_back(make_pair(n.children[i].getWeight(), output));
}
v.push_back(n.children[i].getWord());
printFromNode(n.children[i], v, p);
v.pop_back();
sort(p.begin(), p.end());
}
return;
}
void maxNode(Node n) {
int max = 0;
int index = 0;
int temp = 0;
for (int i = 0; i < n.children.size(); i++) {
temp = n.children[i].children.size();
if (temp > max) {
max = temp;
index = i;
}
}
cout << n.children[index].getWord() << " " << max << endl;
}
};
Main.cpp:
#include "Trie.cpp"
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;
int main(int argc, char* argv[]) {
// Initialize trie up here
Trie myTrie; // plain object; copying from *new Trie() leaked the heap object
// parse input lines until I find newline
for(string line; getline(cin, line) && line.compare(""); ) {
stringstream ss(line);
string string_weight;
ss >> string_weight;
int weight = stoi(string_weight);
// I am just going to put these words into a vector
// you probably want to put them in your trie
vector<string> phrase = {};
for(string word; ss >> word;) {
phrase.push_back(word);
}
myTrie.add(phrase, weight, &myTrie.root);
vector<string> ans = {};
}
// parse query line
string query;
getline(cin, query);
stringstream ss(query);
vector<string> search = {};
for (string query; ss >> query;) {
search.push_back(query);
}
myTrie.get(search);
return 0;
}
You can remove the recursive methods and do something like the following:
#include <algorithm>
#include <iostream>
#include <map>
#include <string>
#include <vector>
#include <set>
class Node
{
public:
bool endOfSentence = false;
std::set<int> weights;
std::map<std::string, Node> children;
Node() = default;
const Node* get(const std::string& word) const
{
auto it = children.find(word);
if (it == children.end()) {
return nullptr;
}
return &it->second;
}
auto find_by_weight(int weight) const
{
return std::find_if(children.begin(),
children.end(),
[=](const auto& p){ return p.second.weights.count(weight);});
}
};
class Trie
{
Node root;
public:
void add(int weight, const std::vector<std::string>& phrase)
{
Node* node = &root;
for (const auto& word : phrase) {
node->weights.insert(weight);
node = &node->children[word];
}
node->weights.insert(weight);
node->endOfSentence = true;
}
bool contains(const std::vector<std::string>& phrase) const
{
const Node* node = &root;
for (const auto& word : phrase) {
node = node->get(word);
if (node == nullptr) {
return false;
}
}
return node->endOfSentence;
}
void print(int weight) const
{
const Node* node = &root;
const char* sep = "";
while (node) {
const auto it = node->find_by_weight(weight);
if (it == node->children.end()) {
break;
}
std::cout << sep << it->first;
sep = " ";
node = &it->second;
}
std::cout << std::endl;
}
void print_all() const
{
for (int i : root.weights) {
print(i);
}
}
};
And usage/Test:
int main(int argc, char* argv[]) {
const std::vector<std::vector<std::string>> sentences = {
{"My", "name", "is", "John"},
{"My", "house", "is", "small"},
{"Hello", "world"},
{"Hello", "world", "!"}
};
Trie trie;
int i = 0;
for (const auto& sentence : sentences) {
trie.add(i, sentence);
++i;
}
const std::vector<std::vector<std::string>> queries = {
{"My", "name", "is", "John"},
{"My", "house"},
{"Hello", "world"}
};
for (const auto& query : queries) {
std::cout << trie.contains(query) << std::endl;
}
trie.print_all();
}
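If you would rather keep a recursive traversal, the "print only once" part of the question can also be handled by letting the recursion only collect (weight, sentence) pairs and doing the sort and the output in a non-recursive caller. This is a different approach from the code above, shown as a minimal sketch: TrieNode, addSentence, collect and sortedSentences are hypothetical names, and a weight greater than 0 is assumed to mark the end of a sentence.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplified node: weight > 0 marks the end of a sentence.
struct TrieNode {
    int weight = -1;
    std::map<std::string, TrieNode> children;
};

void addSentence(TrieNode& root, const std::vector<std::string>& words, int weight) {
    TrieNode* node = &root;
    for (const auto& w : words)
        node = &node->children[w];  // creates missing nodes on the way down
    node->weight = weight;
}

// Recursive helper: only collects results, never sorts or prints.
void collect(const TrieNode& node, std::string& prefix,
             std::vector<std::pair<int, std::string>>& out) {
    if (node.weight > 0)
        out.emplace_back(node.weight, prefix);
    for (const auto& child : node.children) {
        const std::size_t oldSize = prefix.size();
        if (!prefix.empty())
            prefix += ' ';
        prefix += child.first;
        collect(child.second, prefix, out);
        prefix.resize(oldSize);  // undo this child's word before the next one
    }
}

// Non-recursive entry point: sort once, after the traversal is complete.
std::vector<std::pair<int, std::string>> sortedSentences(const TrieNode& start) {
    std::vector<std::pair<int, std::string>> result;
    std::string prefix;
    collect(start, prefix, result);
    std::sort(result.begin(), result.end());  // ascending by weight
    return result;
}
```

The caller then loops over the returned vector and prints each entry exactly once; no sorting or printing happens inside the recursion.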

How do I create a proper Memory Pool for a Multithreaded vector (LFSV)?

Below is an old exercise for a class that is no longer taught at my university (Parallel Processing). The goal is to create and use a Memory Bank to speed up the Lock-Free Sorted Vector (LFSV) implementation. I implemented the Memory Bank myself; the idea is to set aside enough memory up front so that I do not have to use new or delete in the LFSV. I believe I need a Get() function that returns the address of a block of memory (I am not sure how to keep track of the unused memory), and Store() should free the memory (somehow mark it as unused).
Inside LFSV (which worked perfectly fine before my intervention), the exercise explains that I should replace new and delete with placement new and Store(memory we want freed). How do I go about writing Get() (if that is the right approach) and Store() so that they behave like a proper memory bank? I will also take any references or memory bank examples online that you know of, because I am having trouble finding good resources on memory banks and multithreading.
The program compiles without errors, but it reports "FAIL" because I did not manage the memory bank properly.
#include <algorithm>//copy, random_shuffle
#include <ctime> //std::time (NULL) to seed srand
#include <iostream> // std::cout
#include <atomic> // std::atomic
#include <chrono> // std::chrono::milliseconds
#include <thread> // std::thread
#include <vector> // std::vector
#include <mutex> // std::mutex
#include <deque> // std::deque
class MemoryBank
{
std::deque< std::vector<int>* > slots;
public:
MemoryBank() : slots(10000)
{
for (int i = 0; i<10000; ++i)
{
slots[i] = reinterpret_cast<std::vector<int>*>(new char[sizeof(std::vector<int>)]);
}
}
~MemoryBank()
{
for (unsigned int i = 0; i < slots.size(); ++i)
{
delete slots[i];
}
slots.clear();
}
void * Get()
{
return &slots;
}
void Store(std::vector<int *> freeMemory)
{
return;
}
};
class LFSV {
std::atomic< std::vector<int>* > pdata;
std::mutex wr_mutex;
MemoryBank mb;
public:
LFSV() : mb(), pdata( new (mb.Get()) std::vector<int> ) {}
~LFSV()
{
mb.~MemoryBank();
}
void Insert( int const & v ) {
std::vector<int> *pdata_new = nullptr, *pdata_old;
int attempt = 0;
do {
++attempt;
delete pdata_new;
pdata_old = pdata;
pdata_new = new (mb.Get())std::vector<int>( *pdata_old );
std::vector<int>::iterator b = pdata_new->begin();
std::vector<int>::iterator e = pdata_new->end();
if ( b==e || v>=pdata_new->back() ) { pdata_new->push_back( v ); } //first in empty or last element
else {
for ( ; b!=e; ++b ) {
if ( *b >= v ) {
pdata_new->insert( b, v );
break;
}
}
}
// std::lock_guard< std::mutex > write_lock( wr_mutex );
// std::cout << "insert " << v << "(attempt " << attempt << ")" << std::endl;
} while ( !(this->pdata).compare_exchange_weak( pdata_old, pdata_new ));
// LEAKing pdata_old since "delete pdata_old;" will cause errors
// std::lock_guard< std::mutex > write_lock( wr_mutex );
// std::vector<int> * pdata_current = pdata;
// std::vector<int>::iterator b = pdata_current->begin();
// std::vector<int>::iterator e = pdata_current->end();
// for ( ; b!=e; ++b ) {
// std::cout << *b << ' ';
// }
// std::cout << "Size " << pdata_current->size() << " after inserting " << v << std::endl;
}
int const& operator[] ( int pos ) const {
return (*pdata)[ pos ];
}
};
LFSV lfsv;
void insert_range( int b, int e ) {
int * range = new int [e-b];
for ( int i=b; i<e; ++i ) {
range[i-b] = i;
}
std::srand( static_cast<unsigned int>(std::time (NULL)) );
std::random_shuffle( range, range+e-b );
for ( int i=0; i<e-b; ++i ) {
lfsv.Insert( range[i] );
}
delete [] range;
}
int reader( int pos, int how_many_times ) {
int j = 0;
for ( int i=1; i<how_many_times; ++i ) {
j = lfsv[pos];
}
return j;
}
std::atomic<bool> doread( true );
void read_position_0() {
int c = 0;
while ( doread.load() ) {
std::this_thread::sleep_for( std::chrono::milliseconds( 10 ) );
if ( lfsv[0] != -1 ) {
std::cout << "not -1 on iteration " << c << "\n"; // see main - all element are non-negative, so index 0 should always be -1
}
++c;
}
}
void test( int num_threads, int num_per_thread )
{
std::vector<std::thread> threads;
lfsv.Insert( -1 );
std::thread reader = std::thread( read_position_0 );
for (int i=0; i<num_threads; ++i) {
threads.push_back( std::thread( insert_range, i*num_per_thread, (i+1)*num_per_thread ) );
}
for (auto& th : threads) th.join();
doread.store( false );
reader.join();
for (int i=0; i<num_threads*num_per_thread; ++i) {
// std::cout << lfsv[i] << ' ';
if ( lfsv[i] != i-1 ) {
std::cout << "Error\n";
return;
}
}
std::cout << "All good\n";
}
void test0() { test( 1, 100 ); }
void test1() { test( 2, 100 ); }
void test2() { test( 8, 100 ); }
void test3() { test( 100, 100 ); }
void (*pTests[])() = {
test0,test1,test2,test3//,test4,test5,test6,test7
};
#include <cstdio> /* sscanf */
int main( int argc, char ** argv ) {
if (argc==2) { //use test[ argv[1] ]
int test = 0;
std::sscanf(argv[1],"%i",&test);
try {
pTests[test]();
} catch( const char* msg) {
std::cerr << msg << std::endl;
}
return 0;
}
}
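reinterpret_cast is really an "I know what I'm doing, trust me" cast. The compiler will, if possible, believe you.
However, in this case it's entirely wrong. new char[] does not return a vector<int>*: it returns raw bytes in which no vector has been constructed. The destructor then calls delete on memory that was obtained with new char[], which is a mismatch as well. Note also that Get() returns the address of the deque itself, not of one of the slots.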
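For comparison, one way a pool with working Get() and Store() could look is sketched below. This is a minimal sketch under assumptions, not the solution the exercise expects: the class is a hypothetical template, Available() is added only for inspection, and a mutex guards the free list, so the pool itself is not lock-free.

```cpp
#include <cstddef>
#include <mutex>
#include <new>
#include <vector>

// Hypothetical fixed-size pool that hands out raw, suitably aligned storage
// for objects of type T.
template <typename T>
class MemoryBank {
public:
    explicit MemoryBank(std::size_t count) : storage_(count) {
        free_.reserve(count);
        for (auto& slot : storage_)
            free_.push_back(&slot);  // every slot starts out unused
    }

    // Returns storage for one T, or nullptr when the pool is exhausted.
    void* Get() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (free_.empty())
            return nullptr;
        void* p = free_.back();
        free_.pop_back();
        return p;
    }

    // Destroys the object and marks its slot as unused again.
    void Store(T* object) {
        object->~T();
        std::lock_guard<std::mutex> lock(mutex_);
        free_.push_back(object);
    }

    // Number of currently unused slots.
    std::size_t Available() {
        std::lock_guard<std::mutex> lock(mutex_);
        return free_.size();
    }

private:
    struct alignas(T) Slot { unsigned char bytes[sizeof(T)]; };
    std::vector<Slot> storage_;  // owns all the memory; never resized
    std::vector<void*> free_;    // addresses of the unused slots
    std::mutex mutex_;
};
```

With this, new (mb.Get()) std::vector<int>(*pdata_old) constructs a vector inside a slot, and mb.Store(pdata_new) would take the place of delete pdata_new. Two caveats: the pool only holds the vector objects themselves (their element buffers still come from the heap), and returning pdata_old to the pool while a reader may still be dereferencing it is unsafe for the same reason delete pdata_old is.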

Odd performance issue with nested for loops

Below is the full source code you can just copy paste into Visual Studio for easy repro.
#include <Windows.h>
#include <algorithm>
#include <vector>
#include <iostream>
#include <sstream>
LARGE_INTEGER gFreq;
struct CProfileData;
// Yes, we map the pointer itself not the string, for performance reasons
std::vector<CProfileData*> gProfileData;
// simulate a draw buffer access to avoid CBlock::Draw being optimized away
float gDrawBuffer = 0;
struct CTimer
{
CTimer()
{
Reset();
}
size_t GetElapsedMicro()
{
LARGE_INTEGER now;
::QueryPerformanceCounter(&now);
return (1000000 * (now.QuadPart - m_timer.QuadPart)) / gFreq.QuadPart;
}
inline void Reset()
{
::QueryPerformanceCounter(&m_timer);
}
LARGE_INTEGER m_timer;
};
struct CProfileData
{
CProfileData() : m_hitCount(0), m_totalTime(0), m_minTime(-1),
m_maxTime(0), m_name(NULL)
{
gProfileData.push_back(this);
}
size_t m_totalTime;
size_t m_minTime;
size_t m_maxTime;
size_t m_hitCount;
const char * m_name;
};
class CSimpleProfiler
{
public:
CSimpleProfiler(const char * aLocationName, CProfileData * aData)
: m_location(aLocationName), m_data(aData)
{
::QueryPerformanceCounter(&m_clock);
}
~CSimpleProfiler()
{
CProfileData & data = *m_data;
data.m_name = m_location;
++data.m_hitCount;
LARGE_INTEGER now;
::QueryPerformanceCounter(&now);
size_t elapsed = (1000000 * (now.QuadPart - m_clock.QuadPart)) / gFreq.QuadPart;
data.m_totalTime += elapsed;
elapsed < data.m_minTime ? data.m_minTime = elapsed : true;
elapsed > data.m_maxTime ? data.m_maxTime = elapsed : true;
}
static void PrintAll()
{
std::stringstream str;
str.width(20);
str << "Location";
str.width(15);
str << "Total time";
str.width(15);
str << "Average time";
str.width(15);
str << "Hit count";
str.width(15);
str << "Min";
str.width(15);
str << "Max" << std::endl;
::OutputDebugStringA(str.str().c_str());
for (auto i = gProfileData.begin(); i != gProfileData.end(); ++i)
{
CProfileData & data = **i;
std::stringstream str;
str.width(20);
str << data.m_name;
str.width(15);
str << data.m_totalTime;
str.width(15);
str << data.m_totalTime / (float)data.m_hitCount;
str.width(15);
str << data.m_hitCount;
str.width(15);
str << data.m_minTime;
str.width(15);
str << data.m_maxTime << std::endl;
::OutputDebugStringA(str.str().c_str());
}
}
static void Clear()
{
for (auto i = gProfileData.begin(); i != gProfileData.end(); ++i)
{
(*i)->m_totalTime = 0;
(*i)->m_minTime = 0;
(*i)->m_maxTime = 0;
(*i)->m_hitCount = 0;
}
}
private:
LARGE_INTEGER m_clock;
const char * m_location;
CProfileData * m_data;
};
#define PROFILING_ENABLED
#ifdef PROFILING_ENABLED
#define SIMPLE_PROFILE \
static CProfileData pdata ## __LINE__; \
CSimpleProfiler p ## __LINE__(__FUNCTION__, & pdata ## __LINE__)
#define SIMPLE_PROFILE_WITH_NAME(Name) \
static CProfileData pdata ## __LINE__; \
CSimpleProfiler p ## __LINE__(Name, & pdata ## __LINE__)
#else
#define SIMPLE_PROFILE __noop
#define SIMPLE_PROFILE_WITH_NAME(Name) __noop
#endif
void InvalidateL1Cache()
{
const int size = 256 * 1024;
static char *c = (char *)malloc(size);
for (int i = 0; i < 0x0fff; i++)
for (int j = 0; j < size; j++)
c[j] = i*j;
}
int _tmain(int argc, _TCHAR* argv[])
{
::QueryPerformanceFrequency(&gFreq);
LARGE_INTEGER pc;
::QueryPerformanceCounter(&pc);
struct CBlock
{
float x;
float y;
void Draw(float aBlend)
{
for (size_t i = 0; i < 100; ++i )
gDrawBuffer += aBlend;
}
};
typedef std::vector<std::vector<CBlock>> Layer;
typedef std::vector<Layer> Layers;
Layers mBlocks;
// populate with dummy data;
mBlocks.push_back(Layer());
Layer & layer = mBlocks.back();
layer.resize(109);
srand(0); // for reproducibility (determinism)
for (auto i = layer.begin(); i != layer.end(); ++i)
{
i->resize(25 + rand() % 10 - 5);
}
// end populating dummy data
while (1)
{
CSimpleProfiler::Clear();
float aBlend = 1.f / (rand() % 100);
{
for (auto i = mBlocks.begin(); i != mBlocks.end(); ++i)
{
for (auto j = i->begin(); j != i->end(); ++j)
{
CTimer t;
{
SIMPLE_PROFILE_WITH_NAME("Main_Draw_3");
for (auto blockIt = j->begin(); blockIt != j->end();)
{
CBlock * b = nullptr;
{
b = &*blockIt;
}
{
b->Draw(aBlend);
}
{
++blockIt;
}
}
}
if (t.GetElapsedMicro() > 1000)
{
::OutputDebugStringA("SLOWDOWN!\n");
CSimpleProfiler::PrintAll();
}
}
}
}
}
return 0;
}
I get the following profiling from time to time, expressed in microseconds:
SLOWDOWN!
Location       Total time   Average time   Hit count   Min    Max
Main_Draw_3          2047        36.5536          56     0   1040
Normally it takes about 100 microseconds for the Main_Draw_3 block to finish, but from time to time it spikes to 1000 (the Max column). What causes this?
I'm aware cache misses could play a role, but is it really that in this case?... What is happening here and how can I mitigate this?
More info:
compiler VS 2013, compiled with Maximize Speed (/O2)
I think there might be two issues:
Are you compiling with optimizations on? What are the flags?
Maybe you could increase the sample size, for instance by doing ten (or a hundred, or a thousand, etc.) runs of this code in one profiling run. The reason is that with a small sample size the standard deviation is very high.
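The second point can be sketched as follows: time the same piece of work many times, then look at the mean and the standard deviation instead of a single measurement. This is a minimal sketch, assuming the hypothetical helpers timeRuns and summarize and the portable std::chrono::steady_clock instead of QueryPerformanceCounter.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

struct Stats {
    double mean;
    double stddev;  // sample standard deviation
};

// Mean and sample standard deviation of a set of measurements.
Stats summarize(const std::vector<double>& samples) {
    Stats s{0.0, 0.0};
    if (samples.empty())
        return s;
    double sum = 0.0;
    for (double x : samples)
        sum += x;
    s.mean = sum / samples.size();
    if (samples.size() > 1) {
        double squares = 0.0;
        for (double x : samples)
            squares += (x - s.mean) * (x - s.mean);
        s.stddev = std::sqrt(squares / (samples.size() - 1));
    }
    return s;
}

// Runs `work` `runs` times and returns the per-run durations in microseconds.
template <typename F>
std::vector<double> timeRuns(F work, std::size_t runs) {
    std::vector<double> micros;
    micros.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        micros.push_back(
            std::chrono::duration<double, std::micro>(stop - start).count());
    }
    return micros;
}
```

Calling summarize(timeRuns(work, 1000)) for the draw loop would show whether a 1000-microsecond run is a rare outlier or part of the normal spread.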