Simulation design - flow of data, coupling - C++

I am writing a simulation and need some hints on the design. The basic idea is that data for the given stochastic processes is generated and later consumed for various calculations. For example, for one iteration:
Process 1 -> generates data for source 1: x1
Process 2 -> generates data for source 2: x2
and so on
Later I want to apply some transformations, for example on the output of source 2, which results in x2a, x2b, x2c. So I end up with the following vector: [x1, x2a, x2b, x2c].
I have a problem, as for N-multivariate stochastic processes (representing, for example, multiple correlated phenomena) I have to generate an N-dimensional sample at once:
Process 1 -> generates data for source 1...N: x1...xN
I am looking for a simple architecture that would allow me to structure the simulation code and provide flexibility without hindering performance.
I was thinking of something along these lines (pseudocode):
class random_process
{
// concrete processes would generate and store last data
virtual data_ptr operator()() const = 0;
};
class source_proxy
{
container_type<process> processes;
container_type<data_ptr> data; // pointers to the process data storage
data operator[](size_type number) const { return *(data[number]);}
void next() const {/* update the processes */}
};
Somehow I am not convinced about this design. For example, if I'd like to work with vectors of samples instead of a single iteration, then the above design would have to change (I could, for example, have the processes fill submatrices of a proxy-matrix passed to them with data, but again I'm not sure if this is a good idea - if yes, then it would also fit the single-iteration case nicely). Any comments, suggestions and criticism are welcome.
EDIT:
A short summary of the text above to clarify the key points and the situation:
random_processes contain the logic to generate some data. For example it can draw samples from multivariate random gaussian with the given means and correlation matrix. I can use for example Cholesky decomposition - and as a result I'll be getting a set of samples [x1 x2 ... xN]
I can have multiple random_processes, with different dimensionality and parameters
I want to do some transformations on individual elements generated by random_processes
Here is the dataflow diagram
random_processes                         output
     x1 ---------------------------->  x1
                               |---->  x2a
p1   x2 ----------- transform -|---->  x2b
                               |---->  x2c
     x3 ---------------------------->  x3
p2   y1 ----------- transform -|---->  y1a
                               |---->  y1b
The output is being used to do some calculations.
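As a concrete illustration of the Cholesky-based sampling mentioned in the EDIT, here is a minimal sketch. The function names and the plain nested-vector matrix type are my own choices for illustration, not part of the design above:

```cpp
#include <cmath>
#include <random>
#include <vector>

// Lower-triangular Cholesky factor L of a symmetric positive-definite matrix A,
// so that L * L^T == A.
std::vector<std::vector<double>> cholesky(const std::vector<std::vector<double>>& a) {
    std::size_t n = a.size();
    std::vector<std::vector<double>> l(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j <= i; ++j) {
            double sum = a[i][j];
            for (std::size_t k = 0; k < j; ++k) sum -= l[i][k] * l[j][k];
            l[i][j] = (i == j) ? std::sqrt(sum) : sum / l[j][j];
        }
    }
    return l;
}

// Draw one correlated N-dimensional gaussian sample: x = mean + L * z,
// where z is a vector of independent N(0,1) draws.
std::vector<double> draw(const std::vector<double>& mean,
                         const std::vector<std::vector<double>>& l,
                         std::mt19937& gen) {
    std::normal_distribution<double> nd(0.0, 1.0);
    std::size_t n = mean.size();
    std::vector<double> z(n), x(n);
    for (auto& v : z) v = nd(gen);
    for (std::size_t i = 0; i < n; ++i) {
        x[i] = mean[i];
        for (std::size_t j = 0; j <= i; ++j) x[i] += l[i][j] * z[j];
    }
    return x;
}
```

Each call to `draw` produces one sample vector [x1 x2 ... xN] with the correlation structure encoded in the covariance matrix passed to `cholesky`.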

When I read this, "the answer" doesn't materialize in my mind; instead, a question does:
(This problem is part of a class of problems that various tool vendors in the market have created configurable solutions for.)
Do you "have to" write this or can you invest in tried and proven technology to make your life easier?
In my job at Microsoft I work with high performance computing vendors - several of which have math libraries. Folks at these companies would come much closer to understanding the question than I do. :)
Cheers,
Greg Oliver [MSFT]

I'll take a stab at this; perhaps I'm missing something, but it sounds like we have a list of processes 1...N that don't take any arguments and return a data_ptr. So why not store them in a vector (or an array, if the number is known at compile time) and then structure them in whatever way makes sense? You can get really far with the STL: the built-in containers (std::vector), function objects (std::tr1::function), and algorithms (std::transform). You didn't say much about the higher-level structure, so I'm assuming a really naive one here, but clearly you would build the data flow appropriately. It gets even easier if you have a compiler with support for C++0x lambdas, because you can nest the transformations more easily.
//compiled in the SO textbox...
#include <vector>
#include <functional>
#include <algorithm>

typedef int data_ptr;

class Generator{
public:
    data_ptr operator()(){
        //randomly generate input
        return 42 * 4;
    }
};
class StochasticTransformation{
public:
    data_ptr operator()(data_ptr in){
        //apply a randomly seeded function
        return in * 4;
    }
};
int main(){
    //array of processes; wrap this in a class if you like, but it sounds
    //like there is a distinction between generators that create data
    //and transformations
    std::vector<std::tr1::function<data_ptr(void)> > generators;
    //TODO: fill up the process vector with functors...
    generators.push_back(Generator());
    //transformations look like this (right?)
    std::vector<std::tr1::function<data_ptr(data_ptr)> > transformations;
    //so let's add one
    transformations.push_back(StochasticTransformation());
    //and we have an array of results...
    std::vector<data_ptr> results;
    //and we need some inputs
    const int NUMBER = 10;
    for (int i = 0; i < NUMBER; ++i)
        results.push_back(generators[0]());
    //and now start transforming them using transform...
    //pick a random one or do them all...
    std::transform(results.begin(), results.end(),
                   results.begin(), transformations[0]);
}

I think that the second option (the one mentioned in the last paragraph) makes more sense. In the one you presented, you are playing with pointers and indirect access to the random process data. The other one stores all the data (either a vector or a matrix) in one place - the source_proxy object. The random process objects are then called with a submatrix to populate as a parameter, and they do not store any data themselves. The proxy manages everything - from providing the source data (for any distinct source) to requesting new data from the generators.
So, changing your snippet a bit, we could end up with something like this:
class random_process
{
// concrete processes would generate and store last data
virtual void operator()(submatrix &) = 0;
};
class source_proxy
{
container_type<random_process> processes;
matrix data;
data operator[](size_type source_number) const { return a column of data}
void next() {/* get new data from the random processes */}
};
But I agree with the other comment (Greg's) that it is a difficult problem, and depending on the final application it may require some heavy thinking. It's easy to head down a dead end and wind up rewriting lots of code...

How to perform GroupBy Sum query on a list?

Background
I have worked with C#.Net + LINQ wherever possible and am trying my hand at C++ development for a project I am involved in. Of course, I fully realize that C# and C++ are two different worlds.
Question
I have an std::list<T> where T is a struct as follows:
struct SomeStruct{
int id;
int rate;
int value;
};
I need to get a result of group by rate and sum of value. How can I perform GroupBy Sum aggregate function on this list?
Example:
SomeStruct s1;
SomeStruct s2;
SomeStruct s3;
s1.id=1;
s1.rate=5;
s1.value=100;
s2.id=2;
s2.rate=10;
s2.value=50;
s3.id=3;
s3.rate=10;
s3.value=200;
std::list<SomeStruct> myList;
myList.push_front(s1);
myList.push_front(s2);
myList.push_front(s3);
With these inputs I would like to get following output:
rate|value
----|-----
5| 100
10| 250
I found a few promising libs such as CINQ and cppitertools, but I couldn't fully understand them as I lack sufficient knowledge. It would be great if someone could point me in the right direction; I am more than willing to learn new things.
Computing a Group-By sum is relatively straightforward:
using sum_type = int; // but maybe you want a larger type
auto num_groups = max_rate + 1;
std::vector<sum_type> rate_sums(num_groups); // this is initialized to 0
for(const auto& s : myList) {
rate_sums[s.rate] += s.value;
}
This works when the rate values are between 0 and max_rate, and max_rate is not too large relative to myList.size(); otherwise the memory use might be excessive (and you'll have some overhead initializing the vector).
If the rate values are scattered over a large range relative to myList.size(), consider using an std::unordered_map instead of an std::vector.
The code above can also be parallelized. The way to parallelize it depends on your hardware, and there are all sorts of libraries to help you do this; since C++17 the standard library itself offers parallel versions of many algorithms via execution policies.
Remember, though, that linked lists are rather slow to work with, because you have to dereference an arbitrary address to get from one element to the next. If you can get your input in an std::vector or a plain array, that would be faster; and if you can't, it's probably not worth bothering with parallelization.
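For the scattered-rate case, a sketch of the map-based variant (using the struct from the question; `group_by_rate` is my own name) might look like this:

```cpp
#include <list>
#include <unordered_map>

struct SomeStruct {
    int id;
    int rate;
    int value;
};

// Group by rate and sum value; works for arbitrary (even negative or sparse) rates.
std::unordered_map<int, int> group_by_rate(const std::list<SomeStruct>& items) {
    std::unordered_map<int, int> sums;
    for (const auto& s : items)
        sums[s.rate] += s.value; // operator[] default-initializes each sum to 0
    return sums;
}
```

With the three example structs from the question, the result maps rate 5 to 100 and rate 10 to 250.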

Fastest Possible Struct-of-Arrays to Array-of-Structs Conversion

I have a structure that looks like this:
struct SoA
{
int arr1[COUNT];
int arr2[COUNT];
};
And I want it to look like this:
struct AoS
{
int arr1_data;
int arr2_data;
};
std::vector<AoS> points;
as quickly as possible. Order must be preserved.
Is constructing each AoS object individually and pushing it back the fastest way to do this, or is there a faster option?
SoA before;
std::vector<AoS> after;
for (int i = 0; i < COUNT; i++)
    after.push_back(AoS{before.arr1[i], before.arr2[i]});
There are SoA/AoS related questions on StackOverflow, but I haven't found one related to fastest-possible conversion. Because of struct packing differences I can't see any way to avoid copying the data from one format to the next, but I'm hoping someone can tell me there's a way to simply reference the data differently and avoid a copy.
Off the wall solutions especially encouraged.
The binary layouts of SoA and AoS[]/std::vector<AoS> are different, so there is really no way to transform one into the other without a copy operation.
The code you have is pretty close to optimal; one improvement may be to pre-allocate the vector with the expected number of elements. Alternatively, try a raw array, with both whole-element construction and per-property initialization. Changes need to be measured carefully (definitely measure using a fully optimized build with the array sizes you expect) and weighed against the readability/correctness of the code.
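For instance, a version with `reserve` could look like this (the `COUNT` value and the `convert` helper name are arbitrary here):

```cpp
#include <vector>

constexpr int COUNT = 1024;

struct SoA {
    int arr1[COUNT];
    int arr2[COUNT];
};

struct AoS {
    int arr1_data;
    int arr2_data;
};

std::vector<AoS> convert(const SoA& before) {
    std::vector<AoS> after;
    after.reserve(COUNT); // one allocation up front, no reallocations inside the loop
    for (int i = 0; i < COUNT; ++i)
        after.push_back(AoS{before.arr1[i], before.arr2[i]});
    return after;
}
```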
If you don't need the exact binary layout (which seems to be the case, as you are using a vector), you may be able to achieve similar-looking syntax by creating a couple of custom classes that expose the existing data differently. This avoids copying altogether.
You would need an "array" type (providing indexing/iteration over an instance of SoA) and an "element" type (initialized with a reference to an instance of SoA and an index, exposing accessors for the separate fields at that index).
Rough sketch of code (add iterators,...):
class AoS_Element
{
    SoA& soa;
    int index;
public:
    AoS_Element(SoA& soa, int index) ...
    int arr1_data() { return soa.arr1[index]; }
    int arr2_data() { return soa.arr2[index]; }
};
class AoS
{
    SoA& soa;
public:
    AoS(SoA& _soa) : soa(_soa) {}
    AoS_Element operator[](int index) { return AoS_Element(soa, index); }
};
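For completeness, a compilable version of that sketch with a small usage check; I've named the view class `AoS_View` to avoid clashing with the `AoS` struct from the question, and `COUNT` is arbitrary:

```cpp
constexpr int COUNT = 16;

struct SoA {
    int arr1[COUNT];
    int arr2[COUNT];
};

// "element" type: refers back into the SoA instead of owning any data
class AoS_Element {
    SoA& soa;
    int index;
public:
    AoS_Element(SoA& s, int i) : soa(s), index(i) {}
    int arr1_data() const { return soa.arr1[index]; }
    int arr2_data() const { return soa.arr2[index]; }
};

// "array" type: indexable view over the SoA; no copy is ever made
class AoS_View {
    SoA& soa;
public:
    explicit AoS_View(SoA& s) : soa(s) {}
    AoS_Element operator[](int index) { return AoS_Element(soa, index); }
};
```

Usage is then `AoS_View view(storage); view[3].arr1_data();` - the accessors read straight out of the original arrays.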

Boost: Big graphs & Multithreading

I need to create a directed graph that can be quite large from a big dataset. I know these things for sure:
Each node has at most K outgoing edges
I have a list (unordered_map) of N >> K nodes
The graph is built by comparing all nodes with each other (yes, O(N^2) unfortunately)
Thinking about it, I would parallelize the graph creation using std::thread, and I was wondering if this could be done via Boost Graph Library.
If I use the adjacency matrix, it should be possible to preallocate the matrix (K*N elements), and hence it would be thread-safe to insert all adjacent nodes.
I've read that BGL could be thread-unsafe, but the posts I've found are three years old.
Do you know if it's possible to do what I'm thinking? Do you recommend doing otherwise?
Cheers!
Almost any graph algorithm in BGL needs a mapping: vertex -> int which assigns to each vertex a unique integer within the range [0, num_vertices(g) ). This mapping is known as "vertex_index" and is usually accessible as property_map.
Having said that, I can assume your vertices are already integers or associated with some integers (e.g. your unordered_map has some extra field in "mapped_type"). Even better (for performance and memory), if your input vertices are stored in a contiguous, tight array, e.g. std::vector, then indexing is natural.
If vertices are [associated with] integers, your best choice for memory-tight graph is "Compressed Sparse Row Graph". The graph is immutable, so you need to populate edges container before you generate a graph.
As ravenspoint explained, your best choice is to equip each thread with its own local container of results and lock the central container only when merging the local results into the final one. Such a strategy is implemented lock-free by the TBB template tbb::parallel_reduce. So your full code for graph building could look roughly as below:
#include "tbb/blocked_range2d.h"
#include "tbb/parallel_reduce.h"
#include "boost/graph/compressed_sparse_row_graph.hpp"

typedef something vertex; //e.g. something is an integer giving the index of the real data

class EdgeBuilder
{
public:
    typedef std::pair<int,int> edge;
    typedef std::vector<edge> Edges;
    typedef ActualStorage Input;

    EdgeBuilder(const Input & input) : _input(input) {} //OPTIONAL: reserve some space in _edges
    EdgeBuilder(EdgeBuilder& parent, tbb::split) : _input(parent._input) {} // reserve something

    void operator()( const tbb::blocked_range2d<size_t>& r )
    {
        for( size_t i=r.rows().begin(); i!=r.rows().end(); ++i ){
            for( size_t j=r.cols().begin(); j!=r.cols().end(); ++j ) {
                //I assume you provide some function to compute existence
                if (my_func_edge_exist(_input, i, j))
                    _edges.push_back(edge(i,j));
            }
        }
    }

    //merges local results from two TBB threads
    void join( EdgeBuilder& rhs )
    {
        _edges.insert( _edges.end(), rhs._edges.begin(), rhs._edges.end() );
    }

    Edges _edges; //for a given interval of vertices
    const Input & _input;
};

//full flow:
boost::compressed_sparse_row_graph<>* build_graph( const Storage & vertices)
{
    EdgeBuilder builder(vertices);
    tbb::blocked_range2d<size_t> range(0, vertices.size(), 100,  //row grain size
                                       0, vertices.size(), 100); //col grain size
    tbb::parallel_reduce(range, builder);
    boost::compressed_sparse_row_graph<>* theGraph =
        new boost::compressed_sparse_row_graph<>
            (boost::edges_are_unsorted_multi_pass,
             builder._edges.begin(), builder._edges.end(),
             vertices.size() );
    return theGraph;
}
I think you should break your goal down into two separate sub-goals.
Create the links between nodes by doing the N * ( N - 1 ) tests of pairs of nodes. You appear to have an idea of how to break this up into independent threads. Store the results in a data structure that you know is thread safe, without worrying about the mysteries of boost::graph.
Create the boost::graph from your nodes and ( just created ) links.
A note about storing the links created in each thread: it is not so easy to find a suitable thread-safe data structure. If you use an STL dynamically allocated structure, you have to worry about making a thread-safe allocator, which is a challenge. If you pre-allocate, there is a lot of messy code to handle the allocations. So I would suggest storing the links created by each thread in a separate data structure, so they do not have to be thread safe. When the links are all created, you can loop over each thread's links one by one.
A slightly more efficient design could be imagined, but will require a lot of arcane knowledge about thread safety. The design I propose can be implemented without arcane knowledge or tricky code and will therefore be implemented more quickly and more robustly and will be easier to maintain.
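A minimal std::thread version of that strategy might look like the sketch below; `edge_exists` is a stand-in for your pairwise comparison, and the strided row partition is just one simple way to split the work:

```cpp
#include <thread>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;

// Each thread writes into its own local vector; the merge at the end is
// single-threaded, so no locking or thread-safe allocator is needed.
template <typename Pred>
std::vector<Edge> build_edges(int n, int num_threads, Pred edge_exists) {
    std::vector<std::vector<Edge>> local(num_threads);
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&local, t, n, num_threads, &edge_exists] {
            for (int i = t; i < n; i += num_threads)   // strided partition of the rows
                for (int j = 0; j < n; ++j)
                    if (i != j && edge_exists(i, j))
                        local[t].push_back(Edge(i, j));
        });
    }
    for (auto& th : threads) th.join();
    std::vector<Edge> all;                             // merge thread-local results
    for (const auto& v : local)
        all.insert(all.end(), v.begin(), v.end());
    return all;
}
```

The returned edge list can then be fed straight into the compressed_sparse_row_graph constructor shown in the other answer.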

Better understanding the LRU algorithm

I need to implement an LRU algorithm in a 3D renderer for texture caching. I write the code in C++ on Linux.
In my case I will use texture caching to store "tiles" of image data (16x16 pixel blocks). Now imagine that I do a lookup in the cache and get a hit (the tile is in the cache). How do I return the content of the cache for that entry to the caller? Let me explain. I imagine that when I load a tile into the cache memory, I allocate the memory to store 16x16 pixels, for example, then load the image data for that tile. Now there are two solutions to pass the content of the cache entry to the caller:
1) either as a pointer to the tile data (fast, memory efficient),
TileData *tileData = cache->lookup(tileId); // not safe?
2) or I need to copy the tile data from the cache into a memory space allocated by the caller (the copy can be slow).
void Cache::lookup(int tileId, float *&tileData)
{
// find tile in cache, if not in cache load from disk add to cache, ...
...
// now copy tile data, safe but isn't that slow?
memcpy((char*)tileData, tileDataFromCache, sizeof(float) * 3 * 16 * 16);
}
float *tileData = new float[3 * 16 * 16]; // need to allocate the memory for that tile
// get tile data from cache, requires a copy
cache->lookup(tileId, tileData);
I would go with 1), but the problem is: what happens if the tile gets deleted from the cache just after the lookup, and the function tries to access the data using the returned pointer? The only solution I see is to use a form of reference counting (e.g. shared_ptr) where the data is actually only deleted when it's not used anymore?
The application might access more than one texture. I can't seem to find a way of creating a key which is unique to each texture and each tile of a texture. For example, I may have tile 1 from file1 and tile 1 from file2 in the cache, so searching on tileId=1 is not enough... but I can't seem to find a way of creating a key that accounts for both the file name and the tileID. I can build a string that contains the file name and the tileID (FILENAME_TILEID), but wouldn't a string used as a key be much slower than an integer?
Finally I have a question regarding time stamps. Many papers suggest using a time stamp for ordering the entries in the cache. What is a good function for getting a time stamp: time(), clock()? Is there a better way than using time stamps?
Sorry, I realise it's a very long message, but LRU doesn't seem as simple to implement as it sounds.
Answers to your questions:
1) Return a shared_ptr (or something logically equivalent to it). Then all of the "when-is-it-safe-to-delete-this-object" issues pretty much go away.
2) I'd start by using a string as a key, and see whether it actually is too slow or not. If the strings aren't too long (e.g. your filenames aren't too long), you may find it's faster than you expect. If you do find that string keys aren't efficient enough, you could try something like computing a hash code for the string and adding the tile ID to it... that would probably work in practice, although there would always be the possibility of a hash collision. But you could have a collision-check routine run at startup that generates all of the possible filename+tileID combinations and alerts you if any two map to the same key value, so that at least you'd know immediately during testing when there is a problem and could do something about it (e.g. by adjusting your filenames and/or your hash-code algorithm). This assumes that all the filenames and tile IDs are going to be known in advance, of course.
3) I wouldn't recommend using a timestamp, it's unnecessary and fragile. Instead, try something like this (pseudocode):
typedef shared_ptr<TileData> TileDataPtr; // automatic memory management!
linked_list<TileDataPtr> linkedList;
hash_map<data_key_t, TileDataPtr> hashMap;
// This is the method the calling code would call to get its tile data for a given key
TileDataPtr GetData(data_key_t theKey)
{
if (hashMap.contains_key(theKey))
{
// The desired data is already in the cache, great! Just move it to the head
// of the LRU list (to reflect its popularity) and then return it.
TileDataPtr ret = hashMap.get(theKey);
linkedList.remove(ret); // move this item to the head
linkedList.push_front(ret); // of the linked list -- this is O(1)/fast
return ret;
}
else
{
// Oops, the requested object was not in our cache, load it from disk or whatever
TileDataPtr ret = LoadDataFromDisk(theKey);
linkedList.push_front(ret);
hashMap.put(theKey, ret);
// Don't let our cache get too large -- delete
// the least-recently-used item if necessary
if (linkedList.size() > MAX_LRU_CACHE_SIZE)
{
TileDataPtr dropMe = linkedList.tail();
hashMap.remove(dropMe->GetKey());
linkedList.remove(dropMe);
}
return ret;
}
}
In the same order as your questions:
Copying the texture data does not seem reasonable from a performance standpoint. Reference counting sounds far better, as long as you can actually code it safely. The data memory would be freed as soon as it is neither used by the renderer nor has a reference stored in the cache.
I assume that you are going to use some sort of hash table for the look-up part of what you are describing. The common solution to your problem has two parts:
Using a suitable hashing function that combines multiple values, e.g. the texture file name and the tile ID. Essentially you create a composite key that is treated as one entity. The hashing function could be an XOR operation of the hashes of all elementary components, or something more complex.
Selecting a suitable hash function is critical for performance reasons - if the said function is not random enough, you will have a lot of hash collisions.
Using a suitable composite equality check to handle the case of hash collisions.
This way you can look-up the combination of all attributes of interest in a single hash table look-up.
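For example, a composite key plus hash functor for std::unordered_map might look like this; the `TileKey` naming is mine, and the mixing constant follows the boost::hash_combine pattern (a plain XOR also works but mixes less well):

```cpp
#include <cstdint>
#include <functional>
#include <string>

// Composite cache key: texture file + tile ID, treated as one entity.
struct TileKey {
    std::string file;
    std::uint32_t tile;
    bool operator==(const TileKey& o) const {  // equality resolves hash collisions
        return tile == o.tile && file == o.file;
    }
};

struct TileKeyHash {
    std::size_t operator()(const TileKey& k) const {
        std::size_t h = std::hash<std::string>()(k.file);
        // hash_combine-style mixing of the tile ID into the filename hash
        h ^= std::hash<std::uint32_t>()(k.tile) + 0x9e3779b9 + (h << 6) + (h >> 2);
        return h;
    }
};
```

The cache's table then becomes `std::unordered_map<TileKey, TileDataPtr, TileKeyHash>`, and tile 1 of file1 and tile 1 of file2 get distinct entries in a single lookup.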
Using timestamps for this is not going to work - period. Most sources regarding caching describe the algorithms with network resource caching in mind (e.g. HTTP caches). That is not going to work here, for three reasons:
Using natural time only makes sense if you intend to implement caching policies that take it into account, e.g. dropping a cache entry after 10 minutes. Unless you are doing something very unusual, a policy like this makes no sense within a 3D renderer.
Timestamps have a relatively low actual resolution, even if you use high precision timers. Most timer sources have a precision of about 1ms, which is a very long time for a processor - in that time your renderer would have worked through several texture entries.
Do you have any idea how expensive timer calls are? Abusing them like this could even make your system perform worse than not having any cache at all...
The usual solution to this problem is to not use a timer at all. The LRU algorithm only needs to know two things:
The maximum number of entries allowed.
The order of the existing entries w.r.t. their last access.
Item (1) comes from the configuration of the system and typically depends on the available storage space. Item (2) generally implies the use of a combined linked list/hash table data structure, where the hash table part provides fast access and the linked list retains the access order. Each time an entry is accessed, it is placed at the end of the list, while old entries are removed from its start.
Using a combined data structure, rather than two separate ones allows entries to be removed from the hash table without having to go through a look-up operation. This improves the overall performance, but it is not absolutely necessary.
As promised I am posting my code. Please let me know if I have made mistakes or if I could improve it further. I am now going to look into making it work in a multi-threaded environment. Again thanks to Jeremy and Thkala for their help (sorry the code doesn't fit the comment block).
#include <cstdlib>
#include <cstdio>
#include <memory>
#include <list>
#include <unordered_map>
#include <cstdint>
#include <iostream>
typedef uint32_t data_key_t;
class TileData
{
public:
TileData(const data_key_t &key) : theKey(key) {}
data_key_t theKey;
~TileData() { std::cerr << "delete " << theKey << std::endl; }
};
typedef std::shared_ptr<TileData> TileDataPtr; // automatic memory management!
TileDataPtr loadDataFromDisk(const data_key_t &theKey)
{
return std::shared_ptr<TileData>(new TileData(theKey));
}
class CacheLRU
{
public:
// the linked list keeps track of the order in which the data was accessed
std::list<TileDataPtr> linkedList;
// the hash map (unordered_map is part of c++0x while hash_map isn't?) gives quick access to the data
std::unordered_map<data_key_t, TileDataPtr> hashMap;
CacheLRU() : cacheHit(0), cacheMiss(0) {}
TileDataPtr getData(data_key_t theKey)
{
std::unordered_map<data_key_t, TileDataPtr>::const_iterator iter = hashMap.find(theKey);
if (iter != hashMap.end()) {
TileDataPtr ret = iter->second;
linkedList.remove(ret);
linkedList.push_front(ret);
++cacheHit;
return ret;
}
else {
++cacheMiss;
TileDataPtr ret = loadDataFromDisk(theKey);
linkedList.push_front(ret);
hashMap.insert(std::make_pair(theKey, ret));
if (linkedList.size() > MAX_LRU_CACHE_SIZE) {
const TileDataPtr dropMe = linkedList.back();
hashMap.erase(dropMe->theKey);
linkedList.remove(dropMe);
}
return ret;
}
}
static const uint32_t MAX_LRU_CACHE_SIZE = 8;
uint32_t cacheMiss, cacheHit;
};
int main(int argc, char **argv)
{
CacheLRU cache;
for (uint32_t i = 0; i < 238; ++i) {
int key = random() % 32;
TileDataPtr tileDataPtr = cache.getData(key);
}
std::cerr << "Cache hit: " << cache.cacheHit << ", cache miss: " << cache.cacheMiss << std::endl;
return 0;
}

How to dynamically set the number and behavior of inputs/outputs in a neural network?

How would one implement a feed-forward neural network with a configurable number and dynamic behavior of inputs and outputs?
I am trying to add neural networks to the entities in a game I'm working on. However, for every entity type I add I have to create a new neural network with a different number of inputs and outputs, then hard-code how the inputs are set and how the outputs are used to direct behavior.
I would like to find a way to dynamically set all of this, so I don't have to rewrite a new neural net for each entity type.
As I am using C++, I currently have a vector of doubles as the input and output containers. My NN algorithm currently iterates through every element in a layer (including the input "layer") and passes the information to the next layer. I believe this will work fine for now (though I'm open to suggestions). However, my real issue is how to have different behavior for each type of entity without limiting the number of inputs/outputs, or the types of senses/behaviors an entity is allowed to possess.
As an example, say I want to add a creature to the game that can see other creatures, smell food, bite as an attack, and move along the ground. Each eye would be an input, along with the sense of smell; biting would be an output, along with x and y movement. I would need a way to calculate the input values, and extract meaning from the output values in the neural net.
Now if I also wanted to add a creature that can smell other creatures, locate their direction from itself, shoot spines, and float through the air, I would need a different number of input and output calculations (input: smell, location; output: shoot, x, y, z movement).
I would like each entity type to have its own neural net structure, yet have an overall standard interface for the AI system to work with when handling and iterating through each individual network. More specifically, when handling game-senses-to-input conversion, and output-to-game-behavior conversion.
I want emergent behavior from the creatures I add, so I don't know what the "correct" output will be. Because of this, I'm using a simple genetic algorithm to control weight evolution.
Since I haven't been able to find much information regarding my issue, the only idea I've come up with so far is to implement each entity's senses and behaviors as a vector of function pointers, with each function corresponding to a particular input or output. While this allows me to customize how each entity works, and retain a single system for the AI, I'm not sure if this is the most efficient way of accomplishing what I want.
The process function does all of the work in the LearningSystem class:
void LearningSystem::process(int const last_frame_time) {
std::set<unsigned int> const& learning_list = eManager->getAllEntitiesPossessingComponent(ComponentType::intelligence);
vector<double> outputs, inputs;
for (auto entity : learning_list) {
Intelligence& intel = eManager->getComponent<Intelligence>(entity, ComponentType::intelligence);
Sensors& sensor = eManager->getComponent<Sensors>(entity, ComponentType::sensors);
Behavior& behavior = eManager->getComponent<Behavior>(entity, ComponentType::behavior);
// calculate each input value
for (unsigned int i = 0; i < sensor.sensor_list.size(); ++i) {
sensor.triggers[i](sensor.sensor_list[i]);
}
// retrieve the inputs from the sensors...
inputs = sensor.sensor_list;
// ...and add the bias
inputs.push_back(bias);
// for each layer
for (auto i : intel.vecLayers) {
// clear the internal outputs
outputs.clear();
// for each neuron
for (auto j : i.vecNeurons) {
// reset the neuron value
double neuronValue = 0.0;
// for each weight/input pair, sum the weights * inputs
for (auto k = j.vecWeights.begin(), in = inputs.begin(); k != j.vecWeights.end(); ++k, ++in) {
neuronValue += (*k) * (*in);
}
// store the internal outputs for use by the next layer
outputs.push_back(sigmoid(neuronValue));
}
// assign the inputs for the next layer...
inputs = outputs;
// ...and add the bias
inputs.push_back(bias);
}
behavior.values = outputs;
// calculate actions based on output values
for (unsigned int i = 0; i < behavior.values.size(); ++i) {
behavior.actions[i](behavior.values[i]);
}
}
}
I am curious about other ways of implementing this idea, and if there are any resources which address this kind of issue. Any help would be greatly appreciated.
I wrote something like this a long time ago, so unfortunately I don't have the source, but I remember that I defined the structure of the network as an array that was passed to a function that would create the network. Each element of the array was an int describing the number of neurons in a layer, so [2,3,2] for example would create a neural network with 2 input neurons, 3 in the hidden layer, and 2 output neurons. Synapses were created automatically by linking every neuron in neighboring layers. It was very simple, so setting/getting values from the input/output layers was done with a function call like this:
double getValue(int layer, int neuron);
Sorry this is a bit vague, but that's all I can remember.
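As a rough reconstruction of that idea, building the network from a layer-size array could look like the sketch below; the `Network` struct, the weight layout, and the random initialization are my guesses, not the original code:

```cpp
#include <random>
#include <vector>

// Fully connected feed-forward net built from a layer-size list like {2, 3, 2}.
struct Network {
    // weights[l][j][k]: weight from neuron k in layer l to neuron j in layer l+1;
    // each row carries one extra weight for the bias input.
    std::vector<std::vector<std::vector<double>>> weights;
};

Network make_network(const std::vector<int>& layers, unsigned seed = 0) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> dist(-1.0, 1.0);
    Network net;
    for (std::size_t l = 0; l + 1 < layers.size(); ++l) {
        // one weight row per neuron in layer l+1, each of width layers[l] + 1 (bias)
        net.weights.emplace_back(layers[l + 1],
                                 std::vector<double>(layers[l] + 1));
        for (auto& row : net.weights.back())
            for (auto& w : row) w = dist(gen);
    }
    return net;
}
```

Since every entity type is just a different layer-size array, the process loop from the question can run unchanged over any of them; only the sensor/behavior function lists differ per type.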