I have this old batch system. The scheduler stores all computational nodes in one big array. That's OK for the most part, because most queries can be answered by filtering for nodes that satisfy the query.
The problem I have now is that apart from some basic properties (number of CPUs, memory, OS), there are also these weird grouping properties (city, infiniband, network scratch).
The issue with these is that when a user requests nodes with InfiniBand, I can't just give them any nodes; I have to give them nodes connected to the same InfiniBand switch, so that the nodes can actually communicate over InfiniBand.
This is still OK when a user requests only one such property (I can just partition the array by that property and then try to select the nodes in each partition separately).
The problem comes with combining multiple such properties, because then I would have to generate all combinations of the subsets (partitions of the main array).
The good thing is that most of the properties form a subset or equivalence relation (it sort of makes sense for machines on one InfiniBand switch to be in one city). Unfortunately, this isn't strictly true.
Is there some good data structure for storing this kind of semi-hierarchical mostly-tree-like thing?
EDIT: example
node1 : city=city1, infiniband=switch03, networkfs=server01
node2 : city=city1, infiniband=switch03, networkfs=server01
node3 : city=city1, infiniband=switch03
node4 : city=city1, infiniband=switch03
node5 : city=city2, infiniband=switch03, networkfs=server02
node6 : city=city2, infiniband=switch03, networkfs=server02
node7 : city=city2, infiniband=switch04, networkfs=server02
node8 : city=city2, infiniband=switch04, networkfs=server02
A user requests:
2 nodes with infiniband and networkfs
The desired output would be: (node1, node2) or (node5, node6) or (node7, node8).
In a good situation this example wouldn't happen, but we actually have these weird cross-site connections in some cases. If the nodes in city2 were all on infiniband switch04, it would be easy. Unfortunately, I now have to generate groups of nodes that share the same infiniband switch and the same network filesystem.
In reality the problem is much more complicated, since users don't request entire nodes, and there are many properties.
Edit: added the desired output for the query.
Assuming you have p grouping properties and n machines, a bucket-based solution is the easiest to set up and provides O(2^p·log(n)) access and updates.
You create a bucket-heap for every group of properties (so you would have a bucket-heap for "infiniband", a bucket-heap for "networkfs" and a bucket-heap for "infiniband × networkfs"); this means 2^p bucket-heaps.
Each bucket-heap contains a bucket for every combination of values (so the "infiniband" bucket-heap would contain a bucket for key "switch04" and one for key "switch03"); this means a total of at most n·2^p buckets split across all bucket-heaps.
Each bucket is a list of servers (possibly partitioned into available and unavailable). The bucket-heap is a standard heap (see std::make_heap) where the value of each bucket is the number of available servers in that bucket.
Each server stores references to all buckets that contain it.
When you look for servers that match a certain group of properties, you just look in the corresponding bucket-heap for that property group, and climb down the heap looking for a bucket that's large enough to accommodate the number of servers requested. This takes O(log(p)·log(n)).
When servers are marked as available or unavailable, you have to update all buckets containing those servers, and then update the bucket-heaps containing those buckets. This is an O(2^p·log(n)) operation.
If you find yourself having too many properties (and 2^p grows out of control), the algorithm allows for some bucket-heaps to be built on demand from other bucket-heaps: if the user requests "infiniband × networkfs" but you only have a bucket-heap available for "infiniband" or "networkfs", you can turn each bucket in the "infiniband" bucket-heap into a bucket-heap of its own (use a lazy algorithm so you don't have to process all buckets if the first one works) and use a lazy heap-merging algorithm to find an appropriate bucket. You can then use an LRU cache to decide which property groups are stored and which are built on demand.
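To make the bucket layer concrete, here is a minimal C++ sketch for a single property group ("infiniband × networkfs"), using the node data from the example above; the heap layer is elided and replaced by a linear scan over the buckets, which is enough to show the shape:

#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Node { std::string name, infiniband, networkfs; };

int main() {
    // node3 and node4 have no networkfs value, so they never enter this group
    const std::vector<Node> nodes = {
        {"node1", "switch03", "server01"}, {"node2", "switch03", "server01"},
        {"node5", "switch03", "server02"}, {"node6", "switch03", "server02"},
        {"node7", "switch04", "server02"}, {"node8", "switch04", "server02"},
    };

    // One bucket per (infiniband, networkfs) value combination.
    std::map<std::pair<std::string, std::string>, std::vector<const Node*>> buckets;
    for (const auto& n : nodes)
        buckets[{n.infiniband, n.networkfs}].push_back(&n);

    // Query: find any bucket with at least k available servers.
    const std::size_t k = 2;
    for (const auto& [key, members] : buckets) {
        if (members.size() >= k) {
            std::cout << key.first << " x " << key.second << ":";
            for (const Node* n : members) std::cout << ' ' << n->name;
            std::cout << '\n';
            break;
        }
    }
}

A real implementation would keep the buckets of each group in a heap ordered by available-server count, so a sufficiently large bucket is found in logarithmic time instead of by scanning.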
My guess is that there won't be an "easy, efficient" algorithm and data structure to solve this problem, because what you're doing is akin to solving a set of simultaneous equations. Suppose there are 10 categories (like city, infiniband and network) in total, and the user specifies required values for 3 of them. The user asks for 5 nodes, let's say. Your task is then to infer values for the remaining 7 categories, such that at least 5 records exist that have all 10 category fields equal to these values (the 3 specified and the 7 inferred). There may be multiple solutions.
Still, provided there aren't too many different categories, and not too many distinct possibilities within each category, you can do a simple brute force recursive search to find possible solutions, where at each level of recursion you consider a particular category, and "try" each possibility for it. Suppose the user asks for k records, and may choose to stipulate any number of requirements via required_city, required_infiniband, etc.:
either(x, y) := if defined(x) then [x] else y
For each city c in either(required_city, [city1, city2]):
    For each infiniband i in either(required_infiniband, [switch03, switch04]):
        For each networkfs nfs in either(required_nfs, [undefined, server01, server02]):
            Do at least k records of type [c, i, nfs] exist? If so, return them.
The either() function is just a way of limiting the search to the subspace containing points that the user gave constraints for.
Based on this, you will need a way to quickly look up the number of points (rows) for any given [c, i, nfs] combination -- nested hashtables will work just fine for this.
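A minimal C++ sketch of this search, assuming the three categories from the example and using one flat map from value combinations to node lists in place of nested hashtables:

#include <iostream>
#include <map>
#include <optional>
#include <string>
#include <tuple>
#include <vector>

using Key = std::tuple<std::string, std::string, std::string>; // (city, infiniband, networkfs)

// Limit a category to the user's required value, if one was given.
std::vector<std::string> either(const std::optional<std::string>& required,
                                const std::vector<std::string>& all) {
    if (required) return {*required};
    return all;
}

int main() {
    std::map<Key, std::vector<std::string>> rows = {
        {{"city1", "switch03", "server01"}, {"node1", "node2"}},
        {{"city2", "switch03", "server02"}, {"node5", "node6"}},
        {{"city2", "switch04", "server02"}, {"node7", "node8"}},
    };

    const std::size_t k = 2;                             // nodes requested
    const std::optional<std::string> required_city;      // not stipulated
    const std::optional<std::string> required_ib;        // not stipulated
    const std::optional<std::string> required_nfs("server02");

    for (const auto& c : either(required_city, {"city1", "city2"}))
        for (const auto& i : either(required_ib, {"switch03", "switch04"}))
            for (const auto& n : either(required_nfs, {"server01", "server02"})) {
                auto it = rows.find({c, i, n});
                if (it != rows.end() && it->second.size() >= k) {
                    for (const auto& node : it->second) std::cout << node << ' ';
                    std::cout << '\n';
                    return 0; // first solution found
                }
            }
}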
Step 1: Create an index for each property. E.g. for each property+value pair, create a sorted list of nodes with that property. Put each such list into an associative array of some kind: something like an STL map, one per property, indexed by value. When you are done, you have a near-constant-time function that can return a list of nodes matching a single property+value pair. Each list is simply sorted by node number.
Step 2: Given a query, for each property+value pair required, retrieve the list of nodes.
Step 3: Starting with the shortest list (call it list 0), compare it to each of the other lists in turn, removing elements from list 0 that are not in the other lists.
You should now have just the nodes that have all the properties requested.
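A minimal C++ sketch of steps 1 to 3, assuming integer node IDs and an index keyed by "property=value" strings (both are assumptions; std::set_intersection does the per-list filtering):

#include <algorithm>
#include <iterator>
#include <map>
#include <string>
#include <vector>

// Step 1: index from "property=value" to a sorted list of node ids.
std::map<std::string, std::vector<int>> propIndex;

// Steps 2 and 3: fetch the list for each required pair, then intersect,
// starting from the shortest list.
std::vector<int> matchAll(const std::vector<std::string>& required) {
    std::vector<std::vector<int>> lists;
    for (const auto& key : required) lists.push_back(propIndex[key]);
    if (lists.empty()) return {};
    std::sort(lists.begin(), lists.end(),
              [](const auto& a, const auto& b) { return a.size() < b.size(); });
    std::vector<int> result = lists[0];
    for (std::size_t i = 1; i < lists.size(); ++i) {
        std::vector<int> kept;
        std::set_intersection(result.begin(), result.end(),
                              lists[i].begin(), lists[i].end(),
                              std::back_inserter(kept));
        result.swap(kept);
    }
    return result; // nodes that have all requested property+value pairs
}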
Your other option would be to use a database; they are already set up to support queries like this. It can be done all in memory with something like BerkeleyDB with the SQL extensions.
If sorting the list by every criterion mentioned in the query is viable (or having the list pre-sorted by each relative criterion), this works very well.
By "relative criteria", I mean criteria not of the form "x must be 5", which are trivial to filter against, but criteria of the form "x must be the same for each item in the result set". If there are also criteria of the "x must be 5" form, then filter against those first, then do the following.
It relies on using a stable sort on multiple columns to find the matching groups quickly (without trying out combinations).
The complexity is number of nodes × number of criteria in the query for the algorithm itself, plus number of nodes × log(number of nodes) × number of criteria for the sort (if not pre-sorting). So O(nodes · log(nodes) · criteria) overall.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace bleh
{
    // Minimal Node type assumed by the snippet: a bag of string properties.
    class Node
    {
        public Dictionary<string, string> Properties = new Dictionary<string, string>();
    }

    class Program
    {
        static void Main(string[] args)
        {
            List<Node> list = new List<Node>();

            // create a random input list
            Random r = new Random();
            for (int i = 1; i <= 10000; i++)
            {
                Node node = new Node();
                for (char c = 'a'; c <= 'z'; c++) node.Properties[c.ToString()] = (r.Next() % 10 + 1).ToString();
                list.Add(node);
            }

            // if you have any absolute criteria, filter the list first according to them, which is very easy
            // i am sure you know how to do that
            // only look at relative criteria after removing nodes which are eliminated by absolute criteria

            // example
            List<string> criteria = new List<string> { "c", "h", "r", "x" };
            criteria = criteria.OrderBy(x => x).ToList();

            // order the list by each relative criterion, using a ***STABLE*** sort
            // (LINQ's OrderBy is documented to be stable)
            foreach (string s in criteria)
                list = list.OrderBy(x => x.Properties[s]).ToList();

            // size of sought group
            int n = 4;

            // this is the algorithm: scan for a run of n rows equal on all criteria
            int sectionstart = 0;
            int sectionend = 0;
            for (int i = 1; i < list.Count; i++)
            {
                bool same = true;
                foreach (string s in criteria) if (list[i].Properties[s] != list[sectionstart].Properties[s]) same = false;
                if (same == true) sectionend = i;
                else sectionstart = i;
                if (sectionend - sectionstart == n - 1) break;
            }

            // print the results
            Console.WriteLine("\r\nResult:");
            for (int i = sectionstart; i <= sectionend; i++)
            {
                Console.Write("[" + i.ToString() + "]" + "\t");
                foreach (string s in criteria) Console.Write(list[i].Properties[s] + "\t");
                Console.WriteLine();
            }
            Console.ReadLine();
        }
    }
}
I would do something like this (obviously instead of strings you should map them to ints, and use the ints as codes):
struct structNode
{
    std::set<std::string> sMachines;
    std::map<std::string, int> mCodeToIndex;
    std::vector<structNode> vChilds;
};

void Fill(std::string strIdMachine, int iIndex, structNode* pNode, std::vector<std::string>& vCodes)
{
    if (iIndex < vCodes.size())
    {
        // Add "Empty" if needed
        if (pNode->vChilds.size() == 0)
        {
            pNode->mCodeToIndex.insert(pNode->mCodeToIndex.begin(), std::make_pair("empty", 0));
            pNode->vChilds.push_back(structNode());
        }

        // Add for "Empty"
        pNode->vChilds[0].sMachines.insert(strIdMachine);
        Fill(strIdMachine, (iIndex + 1), &pNode->vChilds[0], vCodes);

        if (vCodes[iIndex] == "empty")
            return;

        // Add for "Any"
        std::map<std::string, int>::iterator mIte = pNode->mCodeToIndex.find("any");
        if (mIte == pNode->mCodeToIndex.end())
        {
            mIte = pNode->mCodeToIndex.insert(pNode->mCodeToIndex.begin(), std::make_pair("any", (int)pNode->vChilds.size()));
            pNode->vChilds.push_back(structNode());
        }
        pNode->vChilds[mIte->second].sMachines.insert(strIdMachine);
        Fill(strIdMachine, (iIndex + 1), &pNode->vChilds[mIte->second], vCodes);

        // Add for "Segment"
        mIte = pNode->mCodeToIndex.find(vCodes[iIndex]);
        if (mIte == pNode->mCodeToIndex.end())
        {
            mIte = pNode->mCodeToIndex.insert(pNode->mCodeToIndex.begin(), std::make_pair(vCodes[iIndex], (int)pNode->vChilds.size()));
            pNode->vChilds.push_back(structNode());
        }
        pNode->vChilds[mIte->second].sMachines.insert(strIdMachine);
        Fill(strIdMachine, (iIndex + 1), &pNode->vChilds[mIte->second], vCodes);
    }
}
//////////////////////////////////////////////////////////////////////
// Get
//
// NULL on empty group
//////////////////////////////////////////////////////////////////////
std::set<std::string>* Get(structNode* pNode, int iIndex, std::vector<std::string> vCodes, int iMinValue)
{
    if (iIndex < vCodes.size())
    {
        std::map<std::string, int>::iterator mIte = pNode->mCodeToIndex.find(vCodes[iIndex]);
        if (mIte != pNode->mCodeToIndex.end())
        {
            if ((int)pNode->vChilds[mIte->second].sMachines.size() < iMinValue)
                return NULL;
            else
                return Get(&pNode->vChilds[mIte->second], (iIndex + 1), vCodes, iMinValue);
        }
        else
            return NULL;
    }
    return &pNode->sMachines;
}
To fill the tree with your sample
structNode stRoot;
const char* dummy[] = { "city1", "switch03", "server01" };
const char* dummy2[] = { "city1", "switch03", "empty" };
const char* dummy3[] = { "city2", "switch03", "server02" };
const char* dummy4[] = { "city2", "switch04", "server02" };
// Fill the tree with the sample
Fill("node1", 0, &stRoot, vector<std::string>(dummy, dummy + 3));
Fill("node2", 0, &stRoot, vector<std::string>(dummy, dummy + 3));
Fill("node3", 0, &stRoot, vector<std::string>(dummy2, dummy2 + 3));
Fill("node4", 0, &stRoot, vector<std::string>(dummy2, dummy2 + 3));
Fill("node5", 0, &stRoot, vector<std::string>(dummy3, dummy3 + 3));
Fill("node6", 0, &stRoot, vector<std::string>(dummy3, dummy3 + 3));
Fill("node7", 0, &stRoot, vector<std::string>(dummy4, dummy4 + 3));
Fill("node8", 0, &stRoot, vector<std::string>(dummy4, dummy4 + 3));
Now you can easily obtain all the combinations that you want. For example, your query would be something like this:
vector<std::string> vCodes;
vCodes.push_back("empty"); // Discard first property (cities)
vCodes.push_back("any"); // Any value for infiniband
vCodes.push_back("any"); // Any value for networkfs (except empty)
set<std::string>* pMachines = Get(&stRoot, 0, vCodes, 2);
And, for example, only city2 on switch03 with networkfs not empty:
vector<std::string> vCodes;
vCodes.push_back("city2"); // Only city2
vCodes.push_back("switch03"); // Only switch03
vCodes.push_back("any"); // Any value for networkfs (except empy)
set<std::string>* pMachines = Get(&stRoot, 0, vCodes, 2);
Related
I would like to create an unordered_map where the key is a vector / array of shared_ptr.
Currently I use a string key that is created from the pointed-to objects' string identifiers, and it seems slow.
Is there a way to use either a vector of shared_ptr, or a hash of such a vector, as the key?
Reading is more important than writing: I have roughly 18 times more reads than writes (current benchmark: 7 million writes to the map, 130 million reads).
As I'm new to C++, and especially to performance / memory matters and optimization, I attach the full info of what I'm doing below, including rough pseudo-code.
Problem setting:
In my code I am creating the product of a vector of graphs.
This means that the resulting graph G = G1 * G2 * G3 * ... contains all combinations of the operands' nodes and the transitions between each node.
Thus, what I do is create all combinations of nodes from the operands and collect them in a vector of vectors. These are then combined into a vector of composite nodes.
Subsequently I create the ("unsynchronized") edges of the graph like so:
G1 : (n11) –-> (n12)
G2 : (n21) --> (n22)
G = G1 * G2
G:
(n11,n21) --> (n12,n21)
| |
| |
V V
(n11,n22) --> (n12,n22)
Note: There is no "synchronized" edge from (n11,n21) to (n12,n22)
Magnitude & Problem
I'm combining some 10 graphs with up to 10 nodes each. My current benchmark creates 7,000,000 nodes and >130,000,000 edges (requiring >100 GB of memory).
I am already happy with what I got. However, I used Inspector (on Mac Big Sur) to check whether I have any significant performance issues, and saw that in my create_product routine roughly 20% of the time is spent in the unordered_map's .at(...) routine.
I take it that string keys perform poorly, and ideally I would replace them with the vector / array of pointers, or a hash thereof.
Rough code example:
// info: each node has a name field
vector<vector<shared_ptr<Node>>> node_combinations = get_combinations(); /* function to create the combinations */

// info: the name of a composite is a comma-separated join of the subnode names,
// e.g. "name_n1,name_n2,name_n3,"
// info: CompositeNodes store the vector<shared_ptr<Node>> of the nodes they were created from
vector<shared_ptr<CompositeNode>> composites = get_composites(node_combinations); /* turns each "inside" vector into a composite node */

// create a map to find a composite node by its name
// (this is actually created in the get_combinations() routine right away...)
boost::unordered_map<string, shared_ptr<CompositeNode>> name_to_node;
for (auto& comp : composites) {
    name_to_node[comp->name] = comp;
}

// create edges ... improve below here
for (const auto& compnode : composites) {
    for (const auto& subnode : compnode->subnodes) {
        for (const auto& edge : subnode->edges) {
            // build the name of the target composite
            string target_name;
            for (auto& subnode_target : compnode->subnodes) {
                if (subnode_target == subnode) {
                    // it's the current node, so replace it with the edge's target
                    target_name.append(edge.target->name);
                } else {
                    // otherwise keep the subnode's own name
                    target_name.append(subnode_target->name);
                }
                target_name.append(",");
            }
            shared_ptr<CompositeNode> target = name_to_node.at(target_name);
            /* code to create the edge from subnode to target */
        }
    }
}
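One way to drop the string keys, sketched here under an assumption rather than as a drop-in fix: if a composite is identified by the exact sequence of its subnode pointers, you can hash the shared_ptr addresses directly with a custom hasher (VecPtrHash below is illustrative, not a library type):

#include <memory>
#include <unordered_map>
#include <vector>

struct Node { /* ... */ };
struct CompositeNode { /* ... */ };

using SubnodeKey = std::vector<std::shared_ptr<Node>>;

// Combine the raw pointer values, boost::hash_combine-style.
struct VecPtrHash {
    std::size_t operator()(const SubnodeKey& v) const {
        std::size_t seed = v.size();
        for (const auto& p : v)
            seed ^= std::hash<Node*>()(p.get()) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
        return seed;
    }
};

// vector's operator== compares the shared_ptrs elementwise (i.e. the stored
// pointers), so the map's default equality already does the right thing.
std::unordered_map<SubnodeKey, std::shared_ptr<CompositeNode>, VecPtrHash> subnodes_to_node;

The lookup key for an edge target can then be built by copying the subnode vector and swapping in edge.target, with no string concatenation at all.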
I am holding a very big list of memory addresses (around 400,000) and need to check whether a certain address already exists in it 400,000 times a second.
A code example to illustrate my setup:
std::set<uintptr_t> existingAddresses; // this one contains 400.000 entries
while (true) {
// a new list with possible new addresses
std::set<uintptr_t> newAddresses; // also contains about ~400.000 entries
// in my own code, these represent a new address list
for (auto newAddress : newAddresses) {
// already processed this address, skip it
if (existingAddresses.find(newAddress) != existingAddresses.end()) {
continue;
}
// we didn't have this address yet, so process it.
SomeHeavyTask(newAddress);
// so we don't process it again
existingAddresses.emplace(newAddress);
}
Sleep(1000);
}
This is the first implementation I came up with and I think it can be greatly improved.
Next I came up with a custom indexing strategy, as also used in databases. The idea is to take part of the value and use it to index the address into its own group set. If I took, for example, the last two hex digits of the address, I would have 16^2 = 256 groups to put addresses in.
So I would end up with a map like this:
[FF] -> all address ending with `FF`
[EF] -> all addresses ending with `EF`
[00] -> all addresses ending with `00`
// etc...
With this I would only need to search the roughly 1,560 entries (400,000 / 256) in the corresponding set for each of the 400,000 lookups per second. Much better!
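A minimal sketch of that bucketing idea (the group count and the low-byte mask are illustrative):

#include <array>
#include <cstdint>
#include <set>

std::array<std::set<std::uintptr_t>, 256> groups;

bool seenBefore(std::uintptr_t addr) {
    // pick the group by the low byte (the last two hex digits)
    std::set<std::uintptr_t>& g = groups[addr & 0xFF];
    return !g.insert(addr).second; // insert() reports whether the address was new
}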
I am wondering if there are any other tricks or better ways to do this? My goal is to make this address lookup as FAST as possible.
std::set<uintptr_t> uses a balanced tree, so look-up time is O(log N).
std::unordered_set<uintptr_t>, on the other hand, is hash-based, with lookup time of O(1).
Although this is only an asymptotic complexity measure, meaning that there is no guaranteed improvement due to constant factors involved, the difference may prove significant when the collection contains 400,000 elements.
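The swap is mechanical; a sketch, with a reserve() call so the table doesn't rehash as it grows (the capacity figure is just the expected entry count):

#include <cstdint>
#include <unordered_set>

int main() {
    std::unordered_set<std::uintptr_t> existingAddresses;
    existingAddresses.reserve(400000); // room for the expected entries up front
    // ... same find()/emplace() loop as above, now average-case O(1) per lookup
}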
You may use an algorithm similar to merge:
std::set<uintptr_t> existingAddresses; // this one contains 400.000 entries
while (true) {
// a new list with possible new addresses
std::set<uintptr_t> newAddresses; // also contains about ~400.000 entries
auto existing_it = existingAddresses.begin();
auto new_it = newAddresses.begin();
while (new_it != newAddresses.end() && existing_it != existingAddresses.end()) {
if (*new_it < *existing_it) {
// we didn't have this address yet, so process it.
SomeHeavyTask(*new_it);
// so we don't process it again
existingAddresses.insert(existing_it, *new_it);
++new_it;
} else if (*existing_it < *new_it) {
++existing_it;
} else { // Both equal
++existing_it;
++new_it;
}
}
for (new_it != newAddresses.end())
// we didn't have this address yet, so process it.
SomeHeavyTask(*new_it);
// so we don't process it again
existingAddresses.insert(existingAddresses.end(), *new_it);
++new_it;
}
Sleep(1000);
}
Complexity is now linear: O(N + M) instead of O(N·log M) (with N the number of new addresses and M the number of old ones).
I have a global unique path table which can be thought of as a directed, unweighted graph. Each node represents either a piece of physical hardware which is being controlled, or a unique location in the system. The table contains the following for each node:
A unique path ID (int)
Type of component (char - 'A' or 'L')
A string containing a comma-separated list of the path IDs that the node is connected to (char[])
I need to create a function which, given a starting and an ending node, finds the shortest path between the two. Normally this is a pretty simple problem, but here is the issue I am having: I have a very limited amount of memory/resources, so I cannot use any dynamic memory allocation (i.e. a queue / linked list). It would also be nice if it weren't recursive (but it wouldn't be too big of an issue if it were, as the table/graph itself is really small: currently it has 26 nodes, 8 of which will never be hit; at worst there would be about 40 nodes total).
I started putting something together, but it doesn't always find the shortest path. The pseudo code is below:
bool shortestPath(int start, int end)
    if start == end
        if pathTable[start].nodeType == 'A'
            Turn on part
        end if
        return true
    else
        mark the current node
        bool val
        for each node in connectedNodes
            if node is not marked
                val = shortestPath(node.PathID, end)
            end if
        end for
        if val == true
            if pathTable[start].nodeType == 'A'
                turn on part
            end if
            return true
        end if
    end if
    return false
end function
Anyone have any ideas how to either fix this code, or know something else that I could use to make it work?
----------------- EDIT -----------------
Taking Aasmund's advice, I looked into implementing a breadth-first search. Below is some C# code which I quickly threw together using some pseudo code I found online.
pseudo code found online:
Input: A graph G and a root v of G
procedure BFS(G,v):
    create a queue Q
    enqueue v onto Q
    mark v
    while Q is not empty:
        t ← Q.dequeue()
        if t is what we are looking for:
            return t
        for all edges e in G.adjacentEdges(t) do
            u ← G.adjacentVertex(t,e)
            if u is not marked:
                mark u
                enqueue u onto Q
    return none
C# code which I wrote using this code:
public static bool newCheckPath(int source, int dest)
{
    Queue<PathRecord> Q = new Queue<PathRecord>();
    Q.Enqueue(pathTable[source]);
    pathTable[source].markVisited();
    while (Q.Count != 0)
    {
        PathRecord t = Q.Dequeue();
        if (t.pathID == pathTable[dest].pathID)
        {
            return true;
        }
        else
        {
            string connectedPaths = pathTable[t.pathID].connectedPathID;
            for (int x = 0; x < connectedPaths.Length && connectedPaths != "00"; x = x + 3)
            {
                int nextNode = Convert.ToInt32(connectedPaths.Substring(x, 2));
                PathRecord u = pathTable[nextNode];
                if (!u.wasVisited())
                {
                    u.markVisited();
                    Q.Enqueue(u);
                }
            }
        }
    }
    return false;
}
This code runs just fine; however, it only tells me whether a path exists. That doesn't really work for me. Ideally, in the block "if (t.pathID == pathTable[dest].pathID)", I would like to have either a list or some other way to see which nodes I had to pass through to get from the source to the destination, so that I can process those nodes there rather than returning a list to process elsewhere. Any ideas on how I could make that change?
The most effective solution, if you're willing to use static memory allocation (or automatic, which I seem to recall is the C++ term), is to declare a fixed-size int array (of size 41, if you're absolutely certain that the number of nodes will never exceed 40). By using two indices to indicate the start and end of the queue, you can use this array as a ring buffer, which can act as the queue in a breadth-first search.
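A C++ sketch of that suggestion, extended with a predecessor array so the actual path can be recovered (which is what the edit above asks for); the adjacency arrays stand in for parsing the comma-separated ID strings and are assumptions:

constexpr int MAX_NODES = 41;  // one more than the worst-case node count

// assumed adjacency data, filled in elsewhere by parsing the
// comma-separated connectedPathID strings from the path table
int adjCount[MAX_NODES];
int adj[MAX_NODES][MAX_NODES];

bool shortestPath(int start, int end, int pred[MAX_NODES])
{
    int queue[MAX_NODES];      // fixed array used as a ring buffer
    int head = 0, tail = 0;
    bool visited[MAX_NODES] = {};

    for (int i = 0; i < MAX_NODES; ++i) pred[i] = -1;
    queue[tail] = start; tail = (tail + 1) % MAX_NODES;
    visited[start] = true;

    while (head != tail)
    {
        int t = queue[head]; head = (head + 1) % MAX_NODES;
        if (t == end) return true; // walk pred[] backwards from end to get the path
        for (int i = 0; i < adjCount[t]; ++i)
        {
            int u = adj[t][i];
            if (!visited[u])
            {
                visited[u] = true;
                pred[u] = t;       // remember how we reached u
                queue[tail] = u; tail = (tail + 1) % MAX_NODES;
            }
        }
    }
    return false;
}

Each node is enqueued at most once, so with the array one slot larger than the real node count the tail can never lap the head, and head == tail reliably means "queue empty".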
Alternatively: Since the number of nodes is so small, Bellman-Ford may be fast enough. The algorithm is simple to implement, does not use recursion, and the required extra memory is only a distance (int, or even byte in your case) and a predecessor id (int) per node. The running time is O(VE), alternatively O(V^3), where V is the number of nodes and E is the number of edges.
The array of objects tArray contains buyer names and the number of shares of their purchases; each buyer can appear in the array of objects more than once. I have to return, in an array, the names of the five largest buyers.
I attempted to run two arrays in parallel, with the buyer names in one and their total volume in the other.
My method is flawed in general, as I am getting wrong results. How can I solve this problem?
Thanks
nTransactions = the number of transactions in the array
string* Analyser::topFiveBuyers()
{
    //set size and add buyer names for comparison.
    const int sSize = 5;
    string* calcString = new string[sSize];
    calcString[0] = tArray[0].buyerName;
    calcString[1] = tArray[1].buyerName;
    calcString[2] = tArray[2].buyerName;
    calcString[3] = tArray[3].buyerName;
    calcString[4] = tArray[4].buyerName;
    int calcTotal[sSize] = {INT_MIN, INT_MIN, INT_MIN, INT_MIN, INT_MIN};

    //checks transactions
    for (int i = 0; i < nTransactions; i++)
    {
        //compares with arrays
        for (int j = 0; j < sSize; j++)
        {
            //checks if the same buyer and then increases his total
            if (tArray[i].buyerName == calcString[j])
            {
                calcTotal[j] += tArray[i].numShares;
                break;
            }
            //checks if shares is greater than the current total, then replaces
            if (tArray[i].numShares > calcTotal[j])
            {
                calcTotal[j] = tArray[i].numShares;
                calcString[j] = tArray[i].buyerName;
                break;
            }
        }
    }
    return calcString;
}
Assuming you're allowed to, I'd start by accumulating the values into an std::map:
std::map<std::string, int> totals;
for (int i=0; i<ntransactions; i++)
totals[tarray[i].buyername] += tarray[i].numshares;
This will add up the total number of shares for each buyer. Then you want to copy that data to an std::vector, and get the top 5 by number of shares. For the moment, I'm going to assume your struct (with buyername and numshares as members) is named transaction.
std::vector<transaction> top5;
std::copy(totals.begin(), totals.end(), std::back_inserter(top5));
std::nth_element(top5.begin(), top5.begin()+5, top5.end(), by_shares());
For this to work, you'll need a comparison functor named by_shares that looks something like:
struct by_shares {
    bool operator()(transaction const &a, transaction const &b) {
        return b.numshares < a.numshares;
    }
};
Or, if you're using a compiler new enough to support it, you could use a lambda instead of an explicit functor for the comparison:
std::nth_element(top5.begin(), top5.begin() + 5, top5.end(),
    [](transaction const &a, transaction const &b) {
        return b.numshares < a.numshares;
    });
Either way, after nth_element completes, your top 5 will be in the first 5 elements of the vector. I've reversed the normal comparison to do this, so it's basically working in descending order. Alternatively, you could use ascending order, but specify the spot 5 from the end of the collection instead of 5 from the beginning.
I should add that there are other ways to do this; for example, a Boost bimap would do the job pretty nicely as well. Given that this sounds like homework, my guess is that a pre-packaged solution like bimap that handles virtually the entire job for you probably wouldn't be allowed (and even std::map may be prohibited for pretty much the same reason).
As the same buyer can appear several times, you must store a counter for all buyers, not only for 5 of them: there is no way to know that a buyer you remove from the top 5 shouldn't be part of it after all, as more items could be linked to this buyer later in tArray.
I would suggest using an STL map with the buyer name as key and the number of items as value. You fill it by iterating over tArray, summing all items bought by the same buyer.
Then you can iterate over the map and retrieve the 5 top buyers easily, as you have only one entry per buyer.
When the outer loop starts, the index i is zero, and the same is true for the inner loop. This means the first condition checks tArray[0].buyerName == calcString[0], which is equal since you set it that way before the loops. This leads to calcTotal[0] being increased from -2147483648 (INT_MIN) and the inner loop being left.
I'm not certain, but this doesn't seem like something one would want.
I finally determined that this function is responsible for the majority of my bottleneck issues. I think it's because of the massively excessive random access that happens when most of the synapses are already active. Basically, as the title says, I need to somehow optimize the algorithm so that I'm not randomly checking a ton of active elements before landing on one of the few that are left.
I also included the whole function, in case other flaws can be spotted.
void NetClass::Explore(vector <synapse> & synapses, int & n_syns) //add new synapses
{
int size = synapses.size();
assert(n_syns <= size );
//Increase the age of each active synapse by 1
Age_Increment(synapses);
//make sure there is at least one inactive vector left
if(n_syns == size)
return;
//stochastically decide whether a new connection is added
if((rand_r(seedp) %1000) < ( x / (1 +(n_syns * ( y / 100)))))
{
n_syns++; //a new synapse has been created
//main inefficiency here
while(1)
{
int syn = rand_r(seedp) % (size);
if (!synapses[syn].active)
{
synapses[syn].active = true;
synapses[syn].weight = .04 + (float (rand_r(seedp) % 17) / 100);
break;
}
}
}
}
void NetClass::Age_Increment(vector<synapse>& synapses)
{
    // note: the original "for(int q=0, int size = ...)" doesn't compile;
    // a single declaration list must share one type
    for (size_t q = 0, size = synapses.size(); q < size; q++)
        if (synapses[q].active)
            synapses[q].age++;
}
Pass a random number, k, in the range [0, size - n_syns), to Age_Increment, and have Age_Increment return the k-th inactive slot.
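A sketch of that idea, assuming the function's signature may change and that -1 is an acceptable "none found" result (the free-function form and names are illustrative):

#include <cstddef>
#include <vector>

struct synapse { bool active; int age; float weight; };

// Ages every active synapse, as before, and additionally returns the index
// of the k-th inactive synapse encountered, or -1 if there is none.
int age_increment_and_pick(std::vector<synapse>& synapses, int k)
{
    int kth_inactive = -1;
    int inactive_seen = 0;
    for (std::size_t q = 0, size = synapses.size(); q < size; q++) {
        if (synapses[q].active)
            synapses[q].age++;
        else if (inactive_seen++ == k)
            kth_inactive = static_cast<int>(q);
    }
    return kth_inactive;
}

// In Explore, the retry loop then collapses to something like:
//   int k   = rand_r(seedp) % (size - n_syns);
//   int syn = age_increment_and_pick(synapses, k);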
Since you're already traversing the whole list in Age_Increment, update that function to return a list of the indexes of the inactive synapses.
You can then pick a random item from that list directly.
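The list variant of the same traversal, again as a sketch under the same assumptions as above:

#include <vector>

// struct synapse as in the previous sketch
std::vector<int> age_increment_collect(std::vector<synapse>& synapses)
{
    std::vector<int> inactive;
    for (std::size_t q = 0; q < synapses.size(); q++) {
        if (synapses[q].active)
            synapses[q].age++;
        else
            inactive.push_back(static_cast<int>(q));
    }
    return inactive;
}

// then: int syn = inactive[rand_r(seedp) % inactive.size()]; (guard against empty)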
This is similar to the problem of finding free blocks in memory management, so I would take a look at algorithms used in that domain, specifically free lists: a free list is a list of the free positions. (These are usually implemented as linked lists so that elements can be popped off one end efficiently. Random access in a linked list would still be O(n), with a smaller n, but a linked list is still not the best choice for your use case.)