c++ testing if one file is older than a set of files

I am creating a cache for some data, but of course I want the cache to become invalid if any of the source files from which the cache is made is modified. To that effect I made this function:
bool CacheIsValid(
    const std::string& cache_shader_path,
    const std::vector<std::string>& shader_paths)
{
    // This is really messy because of the std stuff, the short of it is:
    // Find the youngest file in the shader paths, then check it against the
    // timestamp of the cached file.
    std::time_t youngest_file_ts = std::time(nullptr);
    for(auto file : shader_paths)
    {
        std::time_t current_timestamp =
            std::chrono::system_clock::to_time_t(
                std::chrono::file_clock::to_sys(fs::last_write_time(file)));
        double time_diff = difftime(youngest_file_ts, current_timestamp);
        if(time_diff > 0) youngest_file_ts = current_timestamp;
    }
    // All this is doing is comparing the youngest file time stamp with the cache.
    return fs::exists(cache_shader_path)
        && (difftime(youngest_file_ts, std::chrono::system_clock::to_time_t(
            std::chrono::file_clock::to_sys(fs::last_write_time(cache_shader_path)))) < 0);
}
I don't know what I did wrong, but this always returns true, even when the input files are modified. I am checking the timestamps using stat, and the files on disk are objectively younger than the cache file. I also tested that the inputs to this function are correct, and they are.

It seems to me that you're jumping through a lot of hoops that make this tremendously more difficult than it needs to be. filesystem::last_write_time returns a time_point<someclock>¹, which supports comparison. As such, at least as far as I can see, there's no reason to do the long, drawn-out conversion to time_t and then use difftime for the comparison.
If I understand the idea of what you're doing, you want to ascertain that the one file is at least as new as any of the files named in the vector. To do that a bit more directly, I'd write code something along this general line:
bool isNewer(std::string const &file, std::vector<std::string> const &otherFiles) {
    // max_element's comparator plays the role of "less-than"; comparing the
    // write times with < makes it find the most recently modified file.
    auto newest = std::max_element(otherFiles.begin(), otherFiles.end(),
        [](std::string const &a, std::string const &b) {
            return fs::last_write_time(a) < fs::last_write_time(b);
        });
    return fs::last_write_time(file) >= fs::last_write_time(*newest);
}
I may have misunderstood the direction you want one or both comparisons done, in which case this may not be right--but if so, changing the comparison(s) to match your requirements should be trivial.
¹ where someclock is some arbitrary type that meets a few specified requirements for C++17, or std::chrono::file_clock for C++20.
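To see that the time points returned by last_write_time really do compare directly, here is a minimal sketch; the file names are placeholders created on the spot:

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>

namespace fs = std::filesystem;

// Write two files in order and compare their write times directly:
// fs::file_time_type is an ordinary chrono time_point, so operator<= works.
bool secondIsAtLeastAsNew(const char* first, const char* second) {
    std::ofstream(first) << "a";
    std::ofstream(second) << "b";
    return fs::last_write_time(first) <= fs::last_write_time(second);
}
```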

You want youngest_file_ts to end up as the most recent (greatest) timestamp among the changed files. However,
double time_diff = difftime(youngest_file_ts, current_timestamp);
if(time_diff > 0) youngest_file_ts = current_timestamp;
replaces youngest_file_ts whenever the current timestamp is the smaller (older) one, so after the loop youngest_file_ts holds the oldest timestamp of the source files - and it starts from the current time, which is newer than everything. Therefore
(difftime(youngest_file_ts, std::chrono::system_clock::to_time_t(
    std::chrono::file_clock::to_sys(fs::last_write_time(cache_shader_path)))) < 0)
is almost always true.
It should be changed like this:
if (shader_paths.empty()) {
    return false;
} else {
    // initialize youngest_file_ts as last_write_time of first element in shader_paths
    std::time_t youngest_file_ts = std::chrono::system_clock::to_time_t(
        std::chrono::file_clock::to_sys(fs::last_write_time(shader_paths.at(0))));
    for (auto file : shader_paths)
    {
        std::time_t current_timestamp = std::chrono::system_clock::to_time_t(
            std::chrono::file_clock::to_sys(fs::last_write_time(file)));
        double time_diff = difftime(youngest_file_ts, current_timestamp);
        if (time_diff < 0) youngest_file_ts = current_timestamp;
    }
    // All this is doing is comparing the youngest file time stamp with the cache.
    return fs::exists(cache_shader_path)
        && (difftime(youngest_file_ts, std::chrono::system_clock::to_time_t(
            std::chrono::file_clock::to_sys(fs::last_write_time(cache_shader_path)))) < 0);
}


How to declare a std::chrono::duration<double> variable in c++

I am working on some code where I calculate a time duration and then save it in a list.
auto start_time = std::chrono::high_resolution_clock::now();
/*
*some code here
*/
auto finish_time = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> time_duration = finish_time - start_time ;
Now here I need to save time_duration in a list at a given index. If that given index already contains a value, I need to add time_duration to the current value at that index and save it. For this I have the list and code below:
list<std::chrono::duration<double>> dTimeList;
auto iter = dTimeList.begin();
advance(iter, 0);
*iter+= time_duration; //Error at this line
But running the above code I get the error below.
Most probably this error is coming because I have an empty list with no items in it. This is why I thought of adding an item at the 0th index like below:
auto itr = dTimeList.begin();
advance(itr, 0);
std::chrono::duration<double> t = 0.0;
dTimeList.insert(itr, t);
but again the above is also giving the error below. How can I resolve this issue? Thanks
No suitable constructor available to convert double to std::chrono::duration<double>
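For the record, the conversion error comes from duration's converting constructor being explicit: copy-initialisation from a plain double is rejected, while direct initialisation works. A minimal sketch:

```cpp
#include <cassert>
#include <chrono>

// std::chrono::duration<double>'s constructor from a raw tick count is
// explicit, so `duration<double> t = 0.0;` fails to compile. Direct
// initialisation with parentheses (or braces) is accepted:
std::chrono::duration<double> make_seconds(double s)
{
    return std::chrono::duration<double>(s); // or: std::chrono::duration<double>{s}
}
```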
Probably you should not use a list to store data at an index. Try to use std::unordered_map instead.
#include <unordered_map>
auto start_time = std::chrono::high_resolution_clock::now();
/*
*some code here
*/
auto finish_time = std::chrono::high_resolution_clock::now();
std::chrono::duration<double> time_duration = finish_time - start_time ;
typedef int YourObjectIdType;
std::unordered_map<YourObjectIdType, std::chrono::duration<double>> dTimes;
and now insert or update item like this:
dTimes[objectId] += time_duration;
or like this:
auto it = dTimes.find(objectId);
if (it != dTimes.end()) {
    it->second += time_duration; // a map iterator points at a key/value pair
} else {
    dTimes.insert(std::make_pair(objectId, time_duration));
}
also, you can use std::map the same way; it is slightly slower, but the [begin(), end()) range is sorted.
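A quick check of the accumulate-by-key behaviour described above (the objectId values here are made up):

```cpp
#include <cassert>
#include <chrono>
#include <unordered_map>

using seconds_d = std::chrono::duration<double>;

// operator[] default-constructs a zero duration on first access,
// so += works for both the insert and the update case.
seconds_d accumulate_twice(std::unordered_map<int, seconds_d>& times,
                           int objectId, seconds_d d)
{
    times[objectId] += d;
    times[objectId] += d;
    return times[objectId];
}
```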
You're trying to write to the end iterator. When the list is empty, dTimeList.begin() == dTimeList.end(), so you can't write to it.
You need to check if the index exists first, and extend the list otherwise:
void add_time_to_list(
    std::list<std::chrono::duration<double>>& l,
    const std::chrono::duration<double>& t,
    std::size_t index = 0
) {
    if (index < l.size()) {
        *std::next(l.begin(), index) += t;
    } else if (index == l.size()) {
        // New element; add to back
        l.push_back(t);
    } else {
        // Here you can throw an error that index is too big
        // or append 0 until it gets to the desired size or something
    }
}
Note that accessing arbitrary indexes like this could mean that you don't want a std::list. This would be easier with a std::vector (*std::next(l.begin(), index) becomes l[index]). I would also suggest a std::map, where the function can just be written as:
void add_time_to_list(
    std::map<std::size_t, std::chrono::duration<double>>& m,
    const std::chrono::duration<double>& t,
    std::size_t index = 0
) {
    m[index] += t; // if m[index] doesn't exist, it is default constructed.
}
I dare to suggest that std::list is far from the core issue here.
When dealing with std::chrono, one should know its foundation concepts. For whatever is not explained here, please look it up on cppreference.com.
Step one. You decide which Clock you need/want to use.
using Clock = typename std::chrono::high_resolution_clock ;
Step two. You use the nested duration type declared in it. In your use-case you would like to have a sequence of durations.
using time_durations_sequence_type = std::vector<Clock::duration> ;
type naming is very important. I deliberately use the generic term 'sequence'. You might push the idea of std::list to implement it, but I fail to see why; thus I am using std::vector above.
Also notice, I do not use double or long int as the "duration" type. Firstly, std::chrono::duration is the "time taken for an event to happen". Second, it is not a scalar; it is a class type. Traditionally, from the C years, the time concept was based on just ticks.
So, to cut the long story short, we use the std::chrono::duration concept, concretized as a nested type in the "clock" we have selected to use here.
// not a point in time
Clock::duration
//
// there is no guarantee above
// is the same type as
// std::chrono::duration<double>
// and we do not need one
And above is all we need to proceed to implement the functionality you require.
// two points in time
auto start_time = Clock::now();
auto finish_time = Clock::now();
// the duration
Clock::duration time_duration = finish_time - start_time ;
Now we store the duration in our sequence type.
time_durations_sequence_type tdst{ time_duration } ;
And elsewhere we use the sequence of stored durations we have accumulated.
// time_durations_sequence_type tdst
for (auto duration : tdst) {
    std::cout << std::endl
              << duration.count() << " nano seconds"
              << std::endl;
}
Notice how above we used the count() method to get at the actual ticks, which for this clock are commonly (though not by guarantee) nanoseconds.
Working code is here.
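If seconds as a double are wanted regardless of the clock's native tick, the conversion is a one-liner; a small sketch (the helper name is mine):

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// std::chrono::duration<double> (seconds) converts implicitly from any
// integral-tick duration, so no duration_cast is required in this direction.
double elapsed_seconds(Clock::time_point start, Clock::time_point finish)
{
    std::chrono::duration<double> secs = finish - start;
    return secs.count();
}
```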

I need to create MultiMap using hash-table but I get time-limit exceeded error (C++)

I'm trying to solve an algorithm task: I need to create a MultiMap(key,(values)) using a hash table. I can't use the Set and Map libraries. I send the code to the testing system, but I get a time-limit-exceeded error on test 20. I don't know what exactly this test contains. The code must do the following tasks:
put x y - add pair (x,y). If the pair exists, do nothing.
delete x y - delete pair (x,y). If the pair doesn't exist, do nothing.
deleteall x - delete all pairs with first element x.
get x - print the number of pairs with first element x, followed by their second elements.
The number of operations <= 100000
Time limit - 2s
Example:
multimap.in:
put a a
put a b
put a c
get a
delete a b
get a
deleteall a
get a
multimap.out:
3 b c a
2 c a
0
#include <iostream>
#include <fstream>
#include <vector>
using namespace std;

inline long long h1(const string& key) {
    long long number = 0;
    const int p = 31;
    int pow = 1;
    for(auto& x : key){
        number += (x - 'a' + 1) * pow;
        pow *= p;
    }
    return abs(number) % 1000003;
}

inline void Put(vector<vector<pair<string,string>>>& Hash_table, const long long& hash, const string& key, const string& value) {
    int checker = 0;
    for(int i = 0; i < Hash_table[hash].size(); i++) {
        if(Hash_table[hash][i].first == key && Hash_table[hash][i].second == value) {
            checker = 1;
            break;
        }
    }
    if(checker == 0){
        pair<string,string> key_value = make_pair(key,value);
        Hash_table[hash].push_back(key_value);
    }
}

inline void Delete(vector<vector<pair<string,string>>>& Hash_table, const long long& hash, const string& key, const string& value) {
    for(int i = 0; i < Hash_table[hash].size(); i++) {
        if(Hash_table[hash][i].first == key && Hash_table[hash][i].second == value) {
            Hash_table[hash].erase(Hash_table[hash].begin() + i);
            break;
        }
    }
}

inline void Delete_All(vector<vector<pair<string,string>>>& Hash_table, const long long& hash, const string& key) {
    for(int i = Hash_table[hash].size() - 1; i >= 0; i--){
        if(Hash_table[hash][i].first == key){
            Hash_table[hash].erase(Hash_table[hash].begin() + i);
        }
    }
}

inline string Get(const vector<vector<pair<string,string>>>& Hash_table, const long long& hash, const string& key) {
    string result = "";
    int counter = 0;
    for(int i = 0; i < Hash_table[hash].size(); i++){
        if(Hash_table[hash][i].first == key){
            counter++;
            result += Hash_table[hash][i].second + " ";
        }
    }
    if(counter != 0)
        return to_string(counter) + " " + result + "\n";
    else
        return "0\n";
}

int main() {
    vector<vector<pair<string,string>>> Hash_table;
    Hash_table.resize(1000003);
    ifstream input("multimap.in");
    ofstream output("multimap.out");
    string command;
    string key;
    int k = 0;
    string value;
    while(true) {
        input >> command;
        if(input.eof())
            break;
        if(command == "put") {
            input >> key;
            long long hash = h1(key);
            input >> value;
            Put(Hash_table, hash, key, value);
        }
        if(command == "delete") {
            input >> key;
            input >> value;
            long long hash = h1(key);
            Delete(Hash_table, hash, key, value);
        }
        if(command == "get") {
            input >> key;
            long long hash = h1(key);
            output << Get(Hash_table, hash, key);
        }
        if(command == "deleteall"){
            input >> key;
            long long hash = h1(key);
            Delete_All(Hash_table, hash, key);
        }
    }
}
How can I make my code work faster?
First of all, a matter of design: normally, one would pass only the key to the function and calculate the hash within. Your variant allows a user to place elements anywhere within the hash table (using bad hash values), so a user could easily break it.
So, e.g., put:
using HashTable = std::vector<std::vector<std::pair<std::string, std::string>>>;

void put(HashTable& table, std::string const& key, std::string const& value)
{
    auto hash = h1(key);
    // ...
}
If at all, the hash function could be parametrised; but then you'd write a separate class for it (wrapping the vector of vectors) and provide the hash function in the constructor, so that a user cannot exchange it arbitrarily (and again break the hash table). A class would come with additional benefits, most importantly better encapsulation (hiding the vector away, so the user cannot change it through the vector's own interface):
class HashTable
{
public:
    // IF you want to provide hash function:
    template <typename Hash>
    HashTable(Hash hash) : hash(hash) { }

    void put(std::string const& key, std::string const& value);
    void remove(std::string const& key, std::string const& value); // (delete is a keyword!)
    // ...
private:
    std::vector<std::vector<std::pair<std::string, std::string>>> data;
    // if hash function parametrized:
    std::function<size_t(std::string)> hash; // #include <functional>
};
I'm not 100% sure how efficient std::function really is, so for high-performance code you'd preferably use your hash function h1 directly (not implementing the constructor as illustrated above).
Coming to optimisations:
For the hash key I would prefer an unsigned value: negative indices are meaningless anyway, so why allow them at all? long long (signed or unsigned) might be a bad choice if the testing system is a 32-bit system (unlikely, but still...). size_t covers both issues at once: it is unsigned and its size is selected appropriately for the given system (if interested in details: actually adjusted to the address bus size, but on modern systems this is equal to the register size as well, which is what we need). Select the type of pow to be the same.
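With those type changes applied, h1 might look like this (same constants as the original; unsigned overflow simply wraps around, so the abs() call becomes unnecessary):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// size_t variant of the question's hash: unsigned arithmetic wraps
// harmlessly on overflow, and the type matches the platform word size.
inline std::size_t h1(const std::string& key) {
    std::size_t number = 0;
    const std::size_t p = 31;
    std::size_t pow = 1;
    for (unsigned char x : key) {
        number += (x - 'a' + 1) * pow;
        pow *= p;
    }
    return number % 1000003;
}
```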
deleteAll is implemented inefficiently: with each element you erase, you move all the subsequent elements one position towards the front. If you delete multiple elements, you do this repeatedly, so a single element can get moved multiple times. Better:
auto pos = vector.begin();
for(auto& pair : vector)
{
    if(pair.first != keyToDelete)
        *pos++ = std::move(pair); // move semantics: faster than copying!
}
vector.erase(pos, vector.end());
This will move each element at most once, erasing all surplus elements in one single go. Apart from the final erase (which you then have to do explicitly), this is more or less what std::remove and std::remove_if from the algorithm library do as well. Are you allowed to use it? Then your code might look like this:
auto condition = [&keyToDelete](std::pair<std::string, std::string> const& p)
                 { return p.first == keyToDelete; };
vector.erase(std::remove_if(vector.begin(), vector.end(), condition), vector.end());
and you profit from an already highly optimised algorithm.
Just a minor performance gain, but still: you can spare the variable initialisation, assignment and conditional branch (the latter can be a relatively expensive operation on some systems) within put if you simply return when an element is found:
//int checker = 0;
for(auto& pair : hashTable[hash]) // just a little more comfortable to write...
{
    if(pair.first == key && pair.second == value)
        return;
}
auto key_value = std::make_pair(key, value);
hashTable[hash].push_back(key_value);
Again, with the algorithm library:
auto key_value = std::make_pair(key, value);
// same condition as above!
if(std::find_if(vector.begin(), vector.end(), condition) == vector.end())
{
    vector.push_back(key_value);
}
Then, fewer than 100000 operations does not mean that each operation will introduce a separate key/value pair. We might expect that keys are added, removed, re-added, ..., so you most likely don't have to cope with 100000 different values. I'd assume your table is much too large (be aware that it requires initialisation of 1000003 vectors as well); a much smaller one should suffice already (possibly 1009 or 10007? You might have to experiment a little...).
Keeping the inner vectors sorted might give you some performance boost as well:
put: You could use a binary search to find the two elements between which a new one is to be inserted (if one of them is equal to the given one, no insertion, of course)
delete: Use binary search to find the element to delete.
deleteAll: Find upper and lower bounds for elements to be deleted and erase whole range at once.
get: find the lower and upper bound as for deleteAll; the distance in between (number of elements) is a simple subtraction, and you could print the texts out directly (instead of first building a long string). Whether outputting directly or building a string really is more efficient has to be measured, though, as outputting directly involves multiple system calls, which in the end might cost the previously gained performance again...
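The sorted-bucket idea for put can be sketched with std::lower_bound (the bucket type matches the inner vector of the question; the function name is mine):

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using Pair = std::pair<std::string, std::string>;

// Keep the bucket sorted: binary-search for the insertion point and
// skip the insert when the exact pair is already present.
void putSorted(std::vector<Pair>& bucket, const Pair& kv) {
    auto it = std::lower_bound(bucket.begin(), bucket.end(), kv);
    if (it == bucket.end() || *it != kv)
        bucket.insert(it, kv);
}
```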
Considering your input loop:
Checking for eof() (only) is critical! If there is an error in the file, you'll end up in an endless loop: the fail bit gets set, operator>> won't actually read anything any more, and you won't ever reach the end of the file. This might even be the reason your 20th test fails.
Additionally: you have line-based input (each command on a separate line), so reading a whole line at once and only afterwards parsing it will spare you some system calls. If some argument is missing, you will detect it correctly instead of (illegally) reading the next command (e.g. put) as the argument; similarly, you won't interpret a surplus argument as the next command. If a line is invalid for whatever reason (bad number of arguments, as above, or an unknown command), you can then decide individually what you want to do (just ignore the line, or abort processing entirely). So:
std::string line;
while(std::getline(std::cin, line))
{
    // parse the string; if line is invalid, appropriate error handling
    // (ignoring the line, exiting from loop, ...)
}
if(!std::cin.eof())
{
    // some error occurred, print error message!
}

Performance. Look for a substring. substr vs find

Say we have a string variable sA and I would like to check whether the string "123" is at the end of sA.
Which is better to do, and why:
if(sA.length() > 2) sA.substr(sA.length()-3) == "123"
if(sA.length() > 2) sA.find("123", sA.length()-3) != string::npos
Thanks in advance
The second code fragment avoids creation of two temporary objects (one for the "123" converted to std::string, the other for the return value of substr), so in theory it should be faster. However, micro-optimizations of this sort rarely pay off: it is unlikely that you would see a substantial gain from using the second form over the first one if you apply this optimization randomly.
Of course the situation is different if your profiler tells you that your program spends a substantial percentage of its time checking the ending of the string like this; in this case, the optimization will likely help.
If performance is critical, I don't think you can get much faster than this (compared to the other methods, no allocations are necessary):
const char needle[] = "abc";
const char *haystack = /* the string to test */;
const size_t nlen = sizeof(needle) - 1; // sizeof counts the terminating '\0'
const size_t len = strlen(haystack);
if (len < nlen)
    return false;
for (size_t i = 0; i < nlen; i++)
    if (needle[i] != haystack[len - nlen + i])
        return false;
return true;
Obviously various micro-optimizations are possible, but the approach is the fastest I can think of.
A more C++y version, using std::string for the haystack:
const char needle[] = "abc";
const std::string haystack = /* the string to test */;
const size_t nlen = sizeof(needle) - 1; // sizeof counts the terminating '\0'
const size_t len = haystack.length();
if (len < nlen)
    return false;
for (size_t i = 0; i < nlen; i++)
    if (needle[i] != haystack[len - nlen + i])
        return false;
return true;
notice that, as long as std::string::operator[] is O(1), the two listings have the same performance characteristics
This code probably is faster than the ones you are testing, but you will only know if you do some measuring.
bool EndsWith(const string &strValue, const string &strEnd)
{
    // Cast before subtracting: size() is unsigned, so the difference
    // would otherwise wrap around when strEnd is longer than strValue.
    int iDiference = (int)strValue.size() - (int)strEnd.size();
    if(iDiference >= 0)
        return (memcmp(strValue.c_str() + iDiference, strEnd.c_str(), strEnd.size()) == 0);
    return false;
}

Speedier std::insert, or, How to optimize a call that Instruments says is slow?

I'm attempting to use Xcode Instruments to find ways to speed up my app enough to run well on legacy devices. Most of the time is spent in an audio callback, specifically:
void Analyzer::mergeWithOld(tones_t& tones) const {
    tones.sort();
    tones_t::iterator it = tones.begin();
    // Iterate over old tones
    for (tones_t::const_iterator oldit = m_tones.begin(); oldit != m_tones.end(); ++oldit) {
        // Try to find a matching new tone
        while (it != tones.end() && *it < *oldit) ++it;
        // If match found
        if (it != tones.end() && *it == *oldit) {
            // Merge the old tone into the new tone
            it->age = oldit->age + 1;
            it->stabledb = 0.8 * oldit->stabledb + 0.2 * it->db;
            it->freq = 0.5 * oldit->freq + 0.5 * it->freq;
        } else if (oldit->db > -80.0) {
            // Insert a decayed version of the old tone into new tones
            Tone& t = *tones.insert(it, *oldit);
            t.db -= 5.0;
            t.stabledb -= 0.1;
        }
    }
}
I feel a bit like a dog who finally catches a squirrel and then realizes he has no idea what to do next. Can I speed this up, and if so, how do I go about doing it?
EDIT: Of course. tones_t is
typedef std::list<Tone> tones_t;
And Tone is a struct:
struct Tone {
    static const std::size_t MAXHARM = 48; ///< The maximum number of harmonics tracked
    static const std::size_t MINAGE = TONE_AGE_REQUIRED; ///< The minimum age required for a tone to be output
    double freq; ///< Frequency (Hz)
    double db; ///< Level (dB)
    double stabledb; ///< Stable level, useful for graphics rendering
    double harmonics[MAXHARM]; ///< Harmonics' levels
    std::size_t age; ///< How many times the tone has been detected in a row
    double highestFreq;
    double lowestFreq;
    int function;
    float timeStamp;

    Tone();
    void print() const; ///< Prints Tone to std::cout
    bool operator==(double f) const; ///< Compare for rough frequency match
    /// Less-than compare by levels (instead of frequencies like operator< does)
    static bool dbCompare(Tone const& l, Tone const& r) {
        return l.db < r.db;
    }
};
Optimization is a complex thing; you may need to try several approaches.
1: Merge m_tones and tones into a new container, then assign that container back to m_tones. (If you use a vector for this, reserve its capacity beforehand; a std::list has no capacity to preallocate.)
This adds two copies into the mix but eliminates all the inserts. You would have to test to see if it's faster.
2: Dump the list for a different structure. Can you store m_tones in an unordered container (e.g. std::unordered_set) instead of a list? When you need m_tones in an ordered fashion you will need to sort, but if you don't need ordered iteration, or only need it infrequently, it might be faster.
These are just ideas for how to think about the problem differently; you will have to test, test, test to see which option has the best performance.
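The first idea (building the merged sequence in a fresh container instead of inserting mid-list) can be sketched as below. The merge rules mimic the ones in the question, but the struct is a simplified stand-in, not the real Tone, and the comparisons are by frequency only:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the real Tone: only the fields the merge touches.
struct SimpleTone {
    double freq, db, stabledb;
    unsigned age;
};

// Merge old tones into the (sorted) new ones, writing the result into a
// fresh vector: no mid-container inserts, each element visited once.
std::vector<SimpleTone> mergeTones(const std::vector<SimpleTone>& oldTones,
                                   std::vector<SimpleTone> newTones)
{
    std::vector<SimpleTone> out;
    out.reserve(oldTones.size() + newTones.size());
    std::size_t i = 0;
    for (const SimpleTone& o : oldTones) {
        while (i < newTones.size() && newTones[i].freq < o.freq)
            out.push_back(newTones[i++]);
        if (i < newTones.size() && newTones[i].freq == o.freq) {
            SimpleTone merged = newTones[i++];   // matching new tone
            merged.age = o.age + 1;
            merged.stabledb = 0.8 * o.stabledb + 0.2 * merged.db;
            out.push_back(merged);
        } else if (o.db > -80.0) {
            SimpleTone decayed = o;              // decayed copy of the old tone
            decayed.db -= 5.0;
            decayed.stabledb -= 0.1;
            out.push_back(decayed);
        }
    }
    while (i < newTones.size())
        out.push_back(newTones[i++]);
    return out;
}
```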

Alternatives to standard functions of C++ to get speed optimization

Just to clarify: I also think the title is a bit silly. We all know that most built-in functions of the language are really well written and fast (some are even written in assembly). Still, there may be advice for my situation. I have a small project which demonstrates the work of a search engine. In the indexing phase, I have a filter method to filter out unnecessary things from the keywords. Here it is:
bool Indexer::filter(string &keyword)
{
    // Remove all characters defined in isGarbage method
    keyword.resize(std::remove_if(keyword.begin(), keyword.end(), isGarbage) - keyword.begin());
    // Transform all characters to lower case
    std::transform(keyword.begin(), keyword.end(), keyword.begin(), ::tolower);
    // After filtering, if the keyword is empty or it is contained in stop words list, mark as invalid keyword
    if (keyword.size() == 0 || stopwords_.find(keyword) != stopwords_.end())
        return false;
    return true;
}
At first sight, these functions (all member functions of STL containers or standard algorithms) are supposed to be fast and not take much time in the indexing phase. But after profiling with Valgrind, the inclusive cost of this filter is ridiculously high: 33.4%. Three standard functions account for most of that: std::remove_if takes 6.53%, std::set::find takes 15.07% and std::transform takes 7.71%.
So if there is anything I can do (or change) to reduce the time spent in this filter (like parallelizing or something like that), please give me your advice. Thanks in advance.
UPDATE: Thanks for all your suggestions. In brief, what I need to do is:
1) Merge tolower and remove_if into one step by writing my own loop.
2) Use unordered_set instead of set for a faster find.
Thus I've chosen Mark_B's as the right answer.
First, are you certain that optimization and inlining are enabled when you compile?
Assuming that's the case, I would first try writing my own transformer that combines removing garbage and lower-casing into one step to prevent iterating over the keyword that second time.
There's not a lot you can do about the find without using a different container such as unordered_set as suggested in a comment.
Is it possible for your application that doing the filtering really just is a really CPU-intensive part of the operation?
If you use a boost filter iterator you can merge the remove_if and transform into one, something like (untested; note the _1 placeholder for boost::bind):
keyword.erase(std::transform(boost::make_filter_iterator(!boost::bind(isGarbage, _1), keyword.begin(), keyword.end()),
                             boost::make_filter_iterator(!boost::bind(isGarbage, _1), keyword.end(), keyword.end()),
                             keyword.begin(),
                             ::tolower), keyword.end());
This is assuming you want the side effect of modifying the string to still be visible externally; otherwise pass by const reference instead and just use count_if and a predicate to do it all in one.
You can build a hierarchical data structure (basically a tree) for the list of stop words that makes "in-place" matching possible. For example, if your stop words are SELECT, SELECTION, SELECTED you might build a tree:
|- (other/empty accept)
\- S-E-L-E-C-T- (empty, fail)
|- (other, accept)
|- I-O-N (fail)
\- E-D (fail)
You can traverse a tree structure like that simultaneously whilst transforming and filtering without any modifications to the string itself. In reality you'd want to compact the multi-character runs into a single node in the tree (probably).
You can build such a data structure fairly trivially with something like:
#include <iostream>
#include <map>
#include <memory>
#include <string>

class keywords {
    struct node {
        node() : end(false) {}
        std::map<char, std::unique_ptr<node>> children;
        bool end;
    } root;

    void add(const std::string::const_iterator& stop, const std::string::const_iterator c, node& n) {
        if (!n.children[*c])
            n.children[*c] = std::unique_ptr<node>(new node);
        if (stop == c+1) {
            n.children[*c]->end = true;
            return;
        }
        add(stop, c+1, *n.children[*c]);
    }
public:
    void add(const std::string& str) {
        add(str.end(), str.begin(), root);
    }
    bool match(const std::string& str) const {
        const node *current = &root;
        std::string::size_type pos = 0;
        while(current && pos < str.size()) {
            const std::map<char,std::unique_ptr<node>>::const_iterator it = current->children.find(str[pos++]);
            current = it != current->children.end() ? it->second.get() : nullptr;
        }
        if (!current) {
            return false;
        }
        return current->end;
    }
};

int main() {
    keywords list;
    list.add("SELECT");
    list.add("SELECTION");
    list.add("SELECTED");
    std::cout << list.match("TEST") << std::endl;
    std::cout << list.match("SELECT") << std::endl;
    std::cout << list.match("SELECTOR") << std::endl;
    std::cout << list.match("SELECTED") << std::endl;
    std::cout << list.match("SELECTION") << std::endl;
}
This worked as you'd hope and gave:
0
1
0
1
1
Which then just needs match() modified to call the transformation and filtering functions appropriately, e.g.:
const char c = str[pos++];
if (filter(c)) {
    const std::map<char,std::unique_ptr<node>>::const_iterator it = current->children.find(transform(c));
}
You can optimise this a bit (compact long single string runs) and make it more generic, but it shows how doing everything in-place in one pass might be achieved and that's the most likely candidate for speeding up the function you showed.
(Benchmark changes of course)
If a call to isGarbage() does not require synchronization, then parallelization could be the first optimization to consider (given, of course, that filtering one keyword is a big enough task; otherwise parallelization should be done one level higher). Here's how it could be done in one pass through the original data, multi-threaded using Threading Building Blocks:
#include <iostream>
#include <string>
#include <cctype>
#include <tbb/blocked_range.h>
#include <tbb/parallel_reduce.h>

bool isGarbage(char c) {
    return c == 'a';
}

struct RemoveGarbageAndLowerCase {
    std::string result;
    const std::string& keyword;

    RemoveGarbageAndLowerCase(const std::string& keyword_) : keyword(keyword_) {}
    RemoveGarbageAndLowerCase(RemoveGarbageAndLowerCase& r, tbb::split) : keyword(r.keyword) {}

    void operator()(const tbb::blocked_range<size_t> &r) {
        for(size_t i = r.begin(); i != r.end(); ++i) {
            if(!isGarbage(keyword[i])) {
                result.push_back(tolower(keyword[i]));
            }
        }
    }

    void join(RemoveGarbageAndLowerCase &rhs) {
        result.insert(result.end(), rhs.result.begin(), rhs.result.end());
    }
};

void filter_garbage(std::string &keyword) {
    RemoveGarbageAndLowerCase res(keyword);
    tbb::parallel_reduce(tbb::blocked_range<size_t>(0, keyword.size()), res);
    keyword = res.result;
}

int main() {
    std::string keyword = "ThIas_iS:saome-aTYpe_Ofa=MoDElaKEYwoRDastrang";
    filter_garbage(keyword);
    std::cout << keyword << std::endl;
    return 0;
}
Of course, the final code could be improved further by avoiding data copying, but the goal of the sample is to demonstrate that it's an easily threadable problem.
You might make this faster by making a single pass through the string, ignoring the garbage characters. Something like this:
std::string normalizedKeyword;
normalizedKeyword.reserve(keyword.size());
for (auto p = keyword.begin(); p != keyword.end(); ++p)
{
    char ch = *p;
    if (!isGarbage(ch))
        normalizedKeyword.push_back(tolower(ch));
}
// then search for normalizedKeyword in stopwords
This should eliminate the overhead of std::remove_if, although there is a memory allocation and some new overhead of copying characters to normalizedKeyword.
The problem here isn't the standard functions; it's your use of them. You are making multiple passes over your string when you obviously need only one.
What you need to do probably can't be done with the algorithms straight up; you'll need help from boost or to roll your own.
You should also carefully consider whether resizing the string is actually necessary. Yes, you might save some space, but it's going to cost you in speed. Removing this alone might account for quite a bit of your operation's expense.
Here's a way to combine the garbage removal and lower-casing into a single step. It won't work for multi-byte encoding such as UTF-8, but neither did your original code. I assume 0 and 1 are both garbage values.
bool Indexer::filter(string &keyword)
{
    static char replacements[256] = {1}; // initialize with an invalid char
    if (replacements[0] == 1)
    {
        for (int i = 0; i < 256; ++i)
            replacements[i] = isGarbage(i) ? 0 : ::tolower(i);
    }
    string::iterator tail = keyword.begin();
    for (string::iterator it = keyword.begin(); it != keyword.end(); ++it)
    {
        unsigned int index = (unsigned int) *it & 0xff;
        if (replacements[index])
            *tail++ = replacements[index];
    }
    keyword.resize(tail - keyword.begin());
    // After filtering, if the keyword is empty or it is contained in stop words list, mark as invalid keyword
    if (keyword.size() == 0 || stopwords_.find(keyword) != stopwords_.end())
        return false;
    return true;
}
The largest part of your timing is std::set::find, so I'd also try std::unordered_set to see if it improves things.
I would implement it with lower-level C functions, something like this maybe (I haven't checked that this compiles), doing the replacement in place and not resizing the keyword.
Instead of using a set for garbage characters, I'd add a static table of all 256 characters (yes, it will work for ASCII only), with 0 for all characters that are OK, and 1 for those that should be filtered out. Something like:
static const char GARBAGE[256] = { 1, 1, 1, 1, 1, ...., 0, 0, 0, 0, 1, 1, ... };
then for each character at offset pos in a const char *str you can just check if (GARBAGE[(unsigned char)str[pos]] == 1);
this is more or less what an unordered set does, but with far fewer instructions. stopwords should be an unordered_set if it isn't already.
now the filtering function (I'm assuming ASCII and null-terminated strings here):
bool Indexer::filter(char *keyword)
{
    char *head = keyword;
    char *tail = keyword;
    while (*head != '\0') {
        // copy non-garbage chars from head to tail, lowercasing them while at it
        if (!GARBAGE[(unsigned char)*head]) {
            *tail = tolower(*head);
            ++tail; // we only advance tail if there is no garbage
        }
        // head always advances
        ++head;
    }
    *tail = '\0';
    // After filtering, if the keyword is empty or it is contained in stop words list, mark as invalid keyword
    if (tail == keyword || stopwords_.find(keyword) != stopwords_.end())
        return false;
    return true;
}