I have a structure that looks like this:
struct SoA
{
int arr1[COUNT];
int arr2[COUNT];
};
And I want it to look like this:
struct AoS
{
int arr1_data;
int arr2_data;
};
std::vector<AoS> points;
as quickly as possible. Order must be preserved.
Is constructing each AoS object individually and pushing it back the fastest way to do this, or is there a faster option?
SoA before;
std::vector<AoS> after;
for (int i = 0; i < COUNT; i++)
    after.push_back(AoS(before.arr1[i], before.arr2[i]));
There are SoA/AoS related questions on StackOverflow, but I haven't found one related to fastest-possible conversion. Because of struct packing differences I can't see any way to avoid copying the data from one format to the next, but I'm hoping someone can tell me there's a way to simply reference the data differently and avoid a copy.
Off the wall solutions especially encouraged.
The binary layout of SoA and AoS[]/std::vector<AoS> is different, so there is really no way to transform one into the other without a copy operation.
The code you have is pretty close to optimal - one improvement may be to pre-allocate the vector with the expected number of elements. Alternatively, try a raw array, both constructing whole elements and doing per-property initialization. Changes need to be measured carefully (definitely measure using a fully optimized build with the array sizes you expect) and weighed against the readability/correctness of the code.
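For illustration, here is a minimal sketch of the pre-allocated variant (COUNT and the two structs as in the question; the brace initialization stands in for whatever AoS constructor you use):

#include <vector>

const int COUNT = 1024; // placeholder for the real element count

struct SoA { int arr1[COUNT]; int arr2[COUNT]; };
struct AoS { int arr1_data; int arr2_data; };

std::vector<AoS> convert(const SoA& before)
{
    std::vector<AoS> after;
    after.reserve(COUNT); // one allocation up front instead of repeated regrowth
    for (int i = 0; i < COUNT; ++i)
        after.push_back(AoS{before.arr1[i], before.arr2[i]});
    return after;
}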
If you don't need the exact binary layout (which seems to be the case, as you are using a vector) you may be able to achieve similar-looking syntax by creating a couple of custom classes that expose the existing data differently. This avoids copying altogether.
You would need an "array" type (providing indexing/iteration over an instance of SoA) and an "element" type (initialized with a reference to an instance of SoA and an index, exposing accessors for the separate fields at that index).
Rough sketch of code (add iterators,...):
class AoS_Element
{
    SoA& soa;
    int index;
public:
    AoS_Element(SoA& soa, int index) : soa(soa), index(index) {}
    int arr1_data() { return soa.arr1[index]; }
    int arr2_data() { return soa.arr2[index]; }
};

class AoS
{
    SoA& soa;
public:
    AoS(SoA& _soa) : soa(_soa) {}
    AoS_Element operator[](int index) { return AoS_Element(soa, index); }
};
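Usage then looks like ordinary element access, with no copy of the underlying arrays (a sketch based on the classes above):

SoA soa;        // filled elsewhere
AoS view(soa);  // non-owning view, nothing is copied
int a = view[3].arr1_data();
int b = view[3].arr2_data();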
Suppose I have a class akin to the following:
struct Potato {
Eigen::Vector3d position;
double weight, size;
string name;
};
and a collection of Potatos
std::vector<Potato> potato_farm = {potato1, potato2, ...};
This is pretty clearly an array-of-structures (AoS) layout because, let's say for most purposes, it makes sense to have all of a Potato's data lumped together. However, I might like to do a calculation such as finding the most common name, where a structure-of-arrays (SoA) design makes things agnostic to the type of thing that has a name (an array of people with names, an array of places with names, etc.). Does C++ have any tools or tricks that make an AoS layout look like SoA for something like this, or is there a better design that accomplishes the same thing?
You can use lambdas to access a particular member in algorithms that work over a range:
double mean = std::accumulate( potato_farm.begin(), potato_farm.end(), 0.0, []( double val, const Potato &p ) { return val + p.weight; } ) / potato_farm.size();
If that is not enough: you cannot make it look like an array of data, as that requires the objects to be in contiguous memory, but you can make it look like a container. So you can implement custom iterators (for example, a random access iterator with value type double which iterates over the weight member). How to implement custom iterators is described here. You could probably even make that generic, but it is not clear whether that would be worth the effort, as it is not very simple to implement properly.
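To make the idea concrete, here is a minimal sketch of such an iterator (hypothetical names; only the operations std::accumulate needs are shown, and Potato is trimmed to the relevant members):

#include <cstddef>
#include <iterator>
#include <numeric>
#include <string>
#include <vector>

struct Potato { double weight, size; std::string name; };

// Iterates over the weight member of contiguous Potato objects.
class weight_iterator
{
    const Potato* p_;
public:
    typedef std::input_iterator_tag iterator_category;
    typedef double value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const double* pointer;
    typedef const double& reference;

    explicit weight_iterator(const Potato* p) : p_(p) {}
    reference operator*() const { return p_->weight; }
    weight_iterator& operator++() { ++p_; return *this; }
    weight_iterator operator++(int) { weight_iterator t(*this); ++p_; return t; }
    bool operator==(const weight_iterator& o) const { return p_ == o.p_; }
    bool operator!=(const weight_iterator& o) const { return p_ != o.p_; }
};

double total_weight(const std::vector<Potato>& farm)
{
    return std::accumulate(weight_iterator(farm.data()),
                           weight_iterator(farm.data() + farm.size()),
                           0.0);
}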
Unfortunately, there is no language tool to generically change a struct into SoA. This is actually one of the big obstacles when you try to bring SIMD programming to a higher level.
You will need to create the SoA manually. However, you can help yourself by creating a reference type for SoA objects that acts as if it were a regular Potato.
#include <cstddef>
#include <string>

struct Potato {
    float position;
    double weight, size;
    std::string name;
};

struct PotatoSoARef {
    float& position;
    double& weight;
    double& size;
    std::string& name;
};

class PotatoSoA {
private:
    float* position;
    double* weight;
    double* size;
    std::string* name;
public:
    PotatoSoA(std::size_t size) { /* allocate the SoA */ }
    PotatoSoARef operator[](std::size_t idx) {
        return PotatoSoARef{position[idx], weight[idx], size[idx], name[idx]};
    }
};
This way, regardless if you have an AoS or SoA of Potatos, you can access its fields as arr[idx].position etc. (both as r- and l-value). The compiler is likely to optimize the proxy away.
You might want to add other constructors and accessors as well.
You might also be interested in implementing a regular AoS with an operator[] returning a PotatoSoARef if you want functions to have a uniform interface for both AoS and SoA access patterns.
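With that uniform interface, the same code can run over either layout; a hypothetical sketch:

#include <cstddef>

// Works with PotatoSoA from above or with a plain std::vector<Potato>,
// because both spell element access as ps[i].weight.
template <typename Potatoes>
double total_weight(Potatoes& ps, std::size_t n)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += ps[i].weight; // through PotatoSoARef this reads weight[idx]
    return sum;
}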
If you are willing to depart from C++, though, you might be interested in language extensions such as Sierra.
As Slava has said you aren't going to get SoA-like access out of AoS data without writing your own iterators, and I would think really hard about whether the use of STL algorithms is that important before doing that, especially if this isn't meant to be a generic solution. The primary benefit of SoA data is cache performance anyway, not the particular syntax of whatever containers you're using, and nothing besides actual SoA data is going to get you that.
With range-v3 (not in C++17 :-/ ), you may use a projection or a transformation view:
ranges::accumulate(potato_farm, 0., ranges::v3::plus{}, &Potato::weight);
or
auto weightsView = potato_farm | ranges::view::transform([](auto& p) { return p.weight; });
ranges::accumulate(weightsView, 0.);
Problem
Suppose I have a large array of bytes (think up to 4GB) containing some data. These bytes correspond to distinct objects in such a way that every s bytes (think s up to 32) will constitute a single object. One important fact is that this size s is the same for all objects, not stored within the objects themselves, and not known at compile time.
At the moment, these objects are logical entities only, not objects in the programming language. I have a comparison on these objects which consists of a lexicographical comparison of most of the object data, with a bit of different functionality to break ties using the remaining data. Now I want to sort these objects efficiently (this is really going to be a bottleneck of the application).
Ideas so far
I've thought of several possible ways to achieve this, but each of them appears to have some rather unfortunate consequences. You don't necessarily have to read all of these. I tried to print the central question of each approach in bold. If you are going to suggest one of these approaches, then your answer should respond to the related questions as well.
1. C quicksort
Of course the C quicksort function is available in C++ applications as well. Its signature matches my requirements almost perfectly. But the fact that using it prohibits inlining of the comparison function means that every comparison carries function-call overhead. I had hoped for a way to avoid that. Any experience on how C's qsort_r compares to the STL in terms of performance would be very welcome.
2. Indirection using Objects pointing at data
It would be easy to write a bunch of objects holding pointers to their respective data. Then one could sort those. There are two aspects to consider here. On the one hand, just moving around pointers instead of all the data would mean fewer memory operations. On the other hand, not moving the objects would probably hurt memory locality and thus cache performance. The chance that the deeper levels of the quicksort recursion could actually access all their data from a few cache pages would vanish almost completely. Instead, each cached memory page would yield only very few usable data items before being replaced. If anyone could share experience about the trade-off between copying and memory locality, I'd be very glad.
3. Custom iterator, reference and value objects
I wrote a class which serves as an iterator over the memory range. Dereferencing this iterator yields not a reference but a newly constructed object which holds the pointer to the data and the size s given at construction of the iterator. So these objects can be compared, and I even have an implementation of std::swap for them. Unfortunately, it appears that std::swap isn't enough for std::sort. In some parts of the process, my gcc implementation uses insertion sort (as implemented in __insertion_sort in file stl_algo.h), which moves a value out of the sequence, moves a number of items by one step, and then moves the first value back into the sequence at the appropriate position:
typename iterator_traits<_RandomAccessIterator>::value_type
__val = _GLIBCXX_MOVE(*__i);
_GLIBCXX_MOVE_BACKWARD3(__first, __i, __i + 1);
*__first = _GLIBCXX_MOVE(__val);
Do you know of a standard sorting implementation which doesn't require a value type but can operate with swaps alone?
So I'd not only need my class which serves as a reference, but I would also need a class to hold a temporary value. And as the size of my objects is dynamic, I'd have to allocate that on the heap, which means memory allocations at the very leaves of the recursion tree. Perhaps one alternative would be a value type with a static size that is large enough to hold objects of the sizes I currently intend to support. But that would mean even more hackery in the relation between the reference type and the value type of the iterator class. And it would mean I would have to update that size for my application to one day support larger objects. Ugly.
If you can think of a clean way to get the above code to manipulate my data without having to allocate memory dynamically, that would be a great solution. I'm using C++11 features already, so using move semantics or similar won't be a problem.
4. Custom sorting
I even considered reimplementing all of quicksort. Perhaps I could make use of the fact that my comparison is mostly a lexicographical compare, i.e. I could sort sequences by the first byte and only switch to the next byte when the first byte is the same for all elements. I haven't worked out the details on this yet, but if anyone can suggest a reference, an implementation, or even a canonical name to be used as a keyword for such a byte-wise lexicographical sorting, I'd be very happy. I'm still not convinced that with reasonable effort on my part I could beat the performance of the STL template implementation.
5. Completely different algorithm
I know there are many, many kinds of sorting algorithms out there. Some of them might be better suited to my problem. Radix sort comes to mind first, but I haven't really thought this through yet. If you can suggest a sorting algorithm more suited to my problem, please do so. Preferably with an implementation, but even without one.
Question
So basically my question is this:
“How would you efficiently sort objects of dynamic size in heap memory?”
Any answer to this question which is applicable to my situation is good, no matter whether it is related to my own ideas or not. Answers to the individual questions marked in bold, or any other insight which might help me decide between my alternatives, would be useful as well, particularly if no definite answer to a single approach turns up.
The most practical solution is to use the C style qsort that you mentioned.
#include <cstdlib>   // qsort

template <unsigned S>
struct my_obj {
    enum { SIZE = S };
    const void *p_;
    my_obj (const void *p) : p_(p) {}
    //...accessors to get data from pointer, and operator< built on them
    static int c_style_compare (const void *a, const void *b) {
        my_obj aa(a);
        my_obj bb(b);
        return (aa < bb) ? -1 : (bb < aa);
    }
};

template <unsigned N, typename OBJ>
void my_sort (char (&large_array)[N], const OBJ &) {
    qsort(large_array, N/OBJ::SIZE, OBJ::SIZE, OBJ::c_style_compare);
}
(Or, you can call qsort_r if you prefer.) Since STL sort inlines the comparison calls, qsort may not give you the fastest possible sorting. If all your system does is sorting, it may be worth it to add the code to get custom iterators working. But if most of the time your system is doing something other than sorting, the extra gain may just be noise in your overall system.
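A call site could look like this (sizes purely illustrative; assumes the buffer holds whole objects):

char large_array[32 * 1024];            // raw object bytes, filled elsewhere
my_sort(large_array, my_obj<32>(NULL)); // the second argument only carries the type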
Since there are only 32 different object variations (sizes 1 to 32 bytes), you could easily create an object type for each and select a call to std::sort via a switch statement. Each call will get inlined and highly optimized.
Some object sizes might require a custom iterator, as the compiler will insist on padding native objects to align to address boundaries. Pointers can be used as iterators in the other cases since a pointer has all the properties of an iterator.
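A sketch of that dispatch (elem, sort_as and sort_buffer are made-up names; the tie-breaking part of the question's comparison would slot into operator<):

#include <algorithm>
#include <cstddef>
#include <cstring>
#include <stdexcept>

template <unsigned S>
struct elem {
    unsigned char b[S]; // byte array: size S, alignment 1, no padding
    bool operator<(const elem& o) const { return std::memcmp(b, o.b, S) < 0; }
};

template <unsigned S>
void sort_as(unsigned char* data, std::size_t bytes)
{
    elem<S>* first = reinterpret_cast<elem<S>*>(data);
    std::sort(first, first + bytes / S); // comparison inlined per instantiation
}

void sort_buffer(unsigned char* data, std::size_t bytes, unsigned s)
{
    switch (s) { // one fully optimized instantiation per supported size
        case 1:  sort_as<1>(data, bytes);  break;
        case 2:  sort_as<2>(data, bytes);  break;
        case 4:  sort_as<4>(data, bytes);  break;
        case 8:  sort_as<8>(data, bytes);  break;
        // ... cases for all other supported sizes up to 32 ...
        default: throw std::invalid_argument("unsupported element size");
    }
}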
I'd agree with std::sort using a custom iterator, reference and value type; it's best to use the standard machinery where possible.
You worry about memory allocations, but modern memory allocators are very efficient at handing out small chunks of memory, particularly when being repeatedly reused. You could also consider using your own (stateful) allocator, handing out length s chunks from a small pool.
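For illustration, a minimal fixed-size pool along those lines (hypothetical sketch: single-threaded, no exhaustion handling):

#include <cstddef>
#include <vector>

class chunk_pool {
    std::size_t s_;                       // chunk size in bytes
    std::vector<unsigned char> storage_;  // one contiguous slab
    std::vector<void*> free_;             // recycled chunks
public:
    chunk_pool(std::size_t s, std::size_t capacity)
        : s_(s), storage_(s * capacity)
    {
        for (std::size_t i = 0; i < capacity; ++i)
            free_.push_back(&storage_[i * s]);
    }
    void* allocate()                      // precondition: pool not exhausted
    {
        void* p = free_.back();
        free_.pop_back();
        return p;
    }
    void deallocate(void* p) { free_.push_back(p); }
};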
If you can overlay an object onto your buffer, then you can use std::sort, as long as your overlay type is copyable. (In this example, 4 64bit integers). With 4GB of data, you're going to need a lot of memory though.
As discussed in the comments, you can have a selection of possible sizes based on some number of fixed-size templates. You would have to pick from these types at runtime (using a switch statement, for example). Here's an example of the template type with various sizes, and an example of sorting the 64-byte size.
Here's a simple example:
#include <vector>
#include <algorithm>
#include <iostream>
#include <cstdint>
#include <cstdlib>
#include <ctime>

template <int WIDTH>
struct variable_width
{
    unsigned char w_[WIDTH];
};

typedef variable_width<8> vw8;
typedef variable_width<16> vw16;
typedef variable_width<32> vw32;
typedef variable_width<64> vw64;
typedef variable_width<128> vw128;
typedef variable_width<256> vw256;
typedef variable_width<512> vw512;
typedef variable_width<1024> vw1024;

bool operator<(const vw64& l, const vw64& r)
{
    const std::int64_t* l64 = reinterpret_cast<const std::int64_t*>(l.w_);
    const std::int64_t* r64 = reinterpret_cast<const std::int64_t*>(r.w_);
    return *l64 < *r64;
}

std::ostream& operator<<(std::ostream& out, const vw64& w)
{
    const std::int64_t* w64 = reinterpret_cast<const std::int64_t*>(w.w_);
    out << *w64;
    return out;
}

int main()
{
    srand((unsigned)time(NULL));
    std::vector<unsigned char> buffer(10 * sizeof(vw64));
    vw64* w64_arr = reinterpret_cast<vw64*>(&buffer[0]);
    for(int x = 0; x < 10; ++x)
    {
        *reinterpret_cast<std::int64_t*>(w64_arr[x].w_) = rand();
    }
    std::sort(w64_arr, w64_arr + 10);
    for(int x = 0; x < 10; ++x)
    {
        std::cout << w64_arr[x] << '\n';
    }
    std::cout << std::endl;
    return 0;
}
Given the enormous size (4GB), I would seriously consider dynamic code generation. Compile a custom sort into a shared library, and dynamically load it. The only non-inlined call should be the call into the library.
With precompiled headers, the compilation times may actually not be that bad. The whole <algorithm> header doesn't change, nor does your wrapper logic. You just need to recompile a single predicate each time. And since it's a single function you get, linking is trivial.
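Very roughly, and purely as a hypothetical POSIX sketch (paths, names, and the compiler invocation are made up; all error handling omitted):

#include <cstdio>
#include <cstdlib>
#include <dlfcn.h>

typedef void (*sort_fn)(unsigned char*, unsigned long);

sort_fn build_sorter(unsigned s)
{
    // bake the runtime element size s into the generated source as a constant
    std::FILE* f = std::fopen("/tmp/sorter.cpp", "w");
    std::fprintf(f,
        "#include <algorithm>\n"
        "#include <cstring>\n"
        "struct E { unsigned char b[%u];\n"
        "  bool operator<(const E& o) const { return std::memcmp(b, o.b, %u) < 0; } };\n"
        "extern \"C\" void sort_impl(unsigned char* d, unsigned long bytes)\n"
        "{ std::sort(reinterpret_cast<E*>(d), reinterpret_cast<E*>(d) + bytes / %u); }\n",
        s, s, s);
    std::fclose(f);
    std::system("c++ -O3 -shared -fPIC /tmp/sorter.cpp -o /tmp/sorter.so");
    void* lib = dlopen("/tmp/sorter.so", RTLD_NOW);
    return reinterpret_cast<sort_fn>(dlsym(lib, "sort_impl"));
}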
#include <vector>
#include <algorithm>
#include <cstdlib>

#define OBJECT_SIZE 32

struct structObject
{
    unsigned char* pObject;

    bool operator < (const structObject &n) const
    {
        for(int i = 0; i < OBJECT_SIZE; i++)
        {
            if(*(pObject + i) != *(n.pObject + i))
                return (*(pObject + i) < *(n.pObject + i));
        }
        return false;
    }
};

int main()
{
    std::vector<structObject> vObjects;
    unsigned char* pObjects = (unsigned char*)malloc(10 * OBJECT_SIZE); // 10 objects
    for(int i = 0; i < 10; i++)
    {
        structObject stObject;
        stObject.pObject = pObjects + (i * OBJECT_SIZE);
        *stObject.pObject = 'A' + 9 - i; // put a value at the start to check the sort
        vObjects.push_back(stObject);
    }
    std::sort(vObjects.begin(), vObjects.end());

    free(pObjects);
    return 0;
}
To skip the #define
struct structObject
{
    unsigned char* pObject;
};

struct structObjectComparerAscending
{
    int iSize;

    structObjectComparerAscending(int _iSize) : iSize(_iSize) {}

    bool operator()(const structObject &stLeft, const structObject &stRight) const
    {
        for(int i = 0; i < iSize; i++)
        {
            if(*(stLeft.pObject + i) != *(stRight.pObject + i))
                return (*(stLeft.pObject + i) < *(stRight.pObject + i));
        }
        return false;
    }
};

int main()
{
    int iObjectSize = 32; // read it from somewhere
    std::vector<structObject> vObjects;
    unsigned char* pObjects = (unsigned char*)malloc(10 * iObjectSize);
    for(int i = 0; i < 10; i++)
    {
        structObject stObject;
        stObject.pObject = pObjects + (i * iObjectSize);
        *stObject.pObject = 'A' + 9 - i; // put a value at the start to work with something...
        vObjects.push_back(stObject);
    }
    std::sort(vObjects.begin(), vObjects.end(), structObjectComparerAscending(iObjectSize));

    free(pObjects);
    return 0;
}
I have been taught at school to design databases with integer IDs, and I want to know whether that is also a good approach in C/C++. I'm making a game using Ogre3D, so I'd like my game code to use as few cycles as possible.
This is not the exact code (I'm using vectors, and it's about characters and abilities and such), but I'm curious to know whether the line where I access the weight is going to cause a bottleneck, since I'm doing several array subscripts.
struct item
{
    float weight;
    int mask;
    item() : mask(0) {}
} items[2000];

struct shipment
{
    int item_ids[20];
} shipments[10000];

struct order
{
    int shipment_ids[20];
} orders[3000];

int main()
{
    unsigned s = 0;
    // if I want to access an item's data of a certain order, I do:
    for (int i = 0; i < 3000; ++i)
    {
        if (items[shipments[orders[i].shipment_ids[5]].item_ids[5]].weight > 23.0f)
            s |= (1u << 31);
    }
}
I have heard that putting data into arrays is the best way to gain performance when looping over data repeatedly, I just want to know your opinion on this code...
A good optimizer should be able to compute the exact offset of the memory address of each of those items. There are no dependencies between loop iterations, so the loop can be unrolled (and potentially vectorized with SIMD). Looks great, IMHO. If you can avoid floats, that will also help you.
I am writing a simulation and need some hints on the design. The basic idea is that data for the given stochastic processes is generated and later consumed for various calculations. For example, for 1 iteration:
Process 1 -> generates data for source 1: x1
Process 2 -> generates data for source 2: x2
and so on
Later I want to apply some transformations, for example on the output of source 2, which results in x2a, x2b, x2c. So in the end I end up with the following vector: [x1, x2a, x2b, x2c].
I have a problem: for N-multivariate stochastic processes (representing, for example, multiple correlated phenomena) I have to generate an N-dimensional sample at once:
Process 1 -> generates data for source 1...N: x1...xN
I am thinking about a simple architecture that would allow me to structure the simulation code and provide flexibility without hindering performance.
I was thinking of something along these lines (pseudocode):
class random_process
{
    // concrete processes would generate and store the last data
    virtual data_ptr operator()() const = 0;
};

class source_proxy
{
    container_type<random_process> processes;
    container_type<data_ptr> data; // pointers to the process data storage

    data operator[](size_type number) const { return *(data[number]); }
    void next() const { /* update the processes */ }
};
Somehow I am not convinced by this design. For example, if I'd like to work with vectors of samples instead of a single iteration, the above design would have to change (I could, for example, have the processes fill submatrices of a proxy-matrix passed to them with data, but again I am not sure whether this is a good idea - if yes, it would also fit the single-iteration case nicely). Any comments, suggestions, and criticism are welcome.
EDIT:
Short summary of the text above to summarize the key points and clarify the situation:
random_processes contain the logic to generate some data. For example, one can draw samples from a multivariate Gaussian with given means and a correlation matrix. I can use, for example, a Cholesky decomposition - and as a result I'll get a set of samples [x1 x2 ... xN]
I can have multiple random_processes, with different dimensionality and parameters
I want to do some transformations on individual elements generated by random_processes
Here is the dataflow diagram
random_processes                          output

     x1 -------------------------------->  x1
                                  |----->  x2a
p1   x2 ----------- transform ---|----->  x2b
                                  |----->  x2c
     x3 -------------------------------->  x3

p2   y1 ----------- transform ---|----->  y1a
                                  |----->  y1b
The output is being used to do some calculations.
When I read this, "the answer" doesn't materialize in my mind; instead, a question does:
(This problem is part of a class of problems that various tool vendors in the market have created configurable solutions for.)
Do you "have to" write this or can you invest in tried and proven technology to make your life easier?
In my job at Microsoft I work with high performance computing vendors - several of which have math libraries. Folks at these companies would come much closer to understanding the question than I do. :)
Cheers,
Greg Oliver [MSFT]
I'll take a stab at this; perhaps I'm missing something, but it sounds like we have a list of processes 1...N that don't take any arguments and return a data_ptr. So why not store them in a vector (or array) if the number is known at compile time... and then structure them in whatever way makes sense? You can get really far with the STL and the built-in containers (std::vector), function objects (std::tr1::function), and algorithms (std::transform)... You didn't say much about the higher-level structure, so I'm assuming a really naive one, but clearly you would build the data flow appropriately. It gets even easier if you have a compiler with support for C++0x lambdas, because you can nest the transformations more easily.
//compiled in the SO textbox...
#include <vector>
#include <functional>   // std::tr1::function may live in <tr1/functional> on some toolchains
#include <algorithm>    // std::transform

typedef int data_ptr;

class Generator {
public:
    data_ptr operator()() {
        //randomly generate input
        return 42 * 4;
    }
};

class StochasticTransformation {
public:
    data_ptr operator()(data_ptr in) {
        //apply a randomly seeded function
        return in * 4;
    }
};

int main() {
    const int NUMBER = 100;

    //array of processes, wrap this in a class if you like but it sounds
    //like there is a distinction between generators that create data
    //and transformations
    std::vector<std::tr1::function<data_ptr(void)> > generators;
    //fill up the process vector with functors...
    generators.push_back(Generator());

    //transformations look like this (right?)
    std::vector<std::tr1::function<data_ptr(data_ptr)> > transformations;
    //so let's add one
    transformations.push_back(StochasticTransformation());

    //and we have an array of results...
    std::vector<data_ptr> results;
    //and we need some inputs
    for (int i = 0; i < NUMBER; ++i)
        results.push_back(generators[0]());

    //and now start transforming them using transform...
    //pick a random one or do them all...
    std::transform(results.begin(), results.end(),
                   results.begin(), transformations[0]);
    return 0;
}
I think that the second option (the one mentioned in the last paragraph) makes more sense. In the one you presented, you are playing with pointers and indirect access to the random process data. The other one would store all the data (either a vector or a matrix) in one place - the source_proxy object. The random process objects are then called with a submatrix to populate as a parameter, and they themselves do not store any data. The proxy manages everything - from providing the source data (for any distinct source) to requesting new data from the generators.
So, changing your snippet a bit, we could end up with something like this:
class random_process
{
    // concrete processes would generate data into the submatrix
    virtual void operator()(submatrix &) = 0;
};

class source_proxy
{
    container_type<random_process> processes;
    matrix data;

    column_type operator[](size_type source_number) const { /* return a column of data */ }
    void next() { /* get new data from the random processes */ }
};
But I agree with the other comment (Greg's) that it is a difficult problem, and depending on the final application it may require some heavy thinking. It's easy to go down a dead end and end up rewriting lots of code...
I have an array of constant data like the following:
enum Language { GERMAN = LANG_DE, ENGLISH = LANG_EN, /*...*/ };

struct LanguageName {
    Language language;
    const char *name;
};

const LanguageName languages[] = {
    { GERMAN, "German" },
    { ENGLISH, "English" },
    // ...
};
I have a function which accesses the array and finds the entry based on the Language enum parameter. Should I write a loop to find the specific entry in the array, or are there better ways to do this?
I know I could add the LanguageName objects to a std::map, but wouldn't this be overkill for such a simple problem? I do not have an object in which to store the std::map, so the map would be constructed on every call of the function.
What way would you recommend?
Is it better to encapsulate this compile time constant array in a class which handles the lookup?
If the enum values are contiguous starting from 0, use an array with the enum as index.
If not, this is what I usually do:
const char* find_language(Language lang)
{
    typedef std::map<Language, const char*> lang_map_type;
    typedef lang_map_type::value_type lang_map_entry_type;

    static const lang_map_entry_type lang_map_entries[] = { /*...*/ };
    static const lang_map_type lang_map( lang_map_entries
                                       , lang_map_entries + sizeof(lang_map_entries)
                                                          / sizeof(lang_map_entries[0]) );

    lang_map_type::const_iterator it = lang_map.find(lang);
    if( it == lang_map.end() ) return NULL;
    return it->second;
}
If you consider a map for constants, always also consider using a vector.
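For example, a sorted vector of pairs gives map-like lookups via std::lower_bound (a sketch; assumes the Language enum from the question and that the vector is kept sorted by key):

#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::pair<Language, const char*> lang_entry;

struct key_less
{
    bool operator()(const lang_entry& e, Language lang) const
    { return e.first < lang; }
};

const char* find_language_in(const std::vector<lang_entry>& lang_vec, Language lang)
{
    std::vector<lang_entry>::const_iterator it =
        std::lower_bound(lang_vec.begin(), lang_vec.end(), lang, key_less());
    if (it == lang_vec.end() || it->first != lang) return NULL;
    return it->second;
}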
Function-local statics are a nice way to get rid of a good part of the dependency problems of globals, but they are dangerous in a multi-threaded environment. If you're worried about that, you might rather want to use globals:
typedef std::map<Language, const char*> lang_map_type;
typedef lang_map_type::value_type lang_map_entry_type;

const lang_map_entry_type lang_map_entries[] = { /*...*/ };
const lang_map_type lang_map( lang_map_entries
                            , lang_map_entries + sizeof(lang_map_entries)
                                               / sizeof(lang_map_entries[0]) );
const char* find_language(Language lang)
{
lang_map_type::const_iterator it = lang_map.find(lang);
if( it == lang_map.end() ) return NULL;
return it->second;
}
There are three basic approaches that I'd choose from. One is the switch statement, and it is a very good option under certain conditions. Remember - the compiler is probably going to compile that into an efficient table-lookup for you, though it will be looking up pointers to the case code blocks rather than data values.
Options two and three involve static arrays of the type you are using. Option two is a simple linear search - which you are (I think) already doing - very appropriate if the number of items is small.
Option three is a binary search. Static arrays can be used with standard library algorithms - just use the first and first+count pointers in the same way that you'd use begin and end iterators. You will need to ensure the data is sorted (using std::sort or std::stable_sort), and use std::lower_bound to do the binary search.
The complication in this case is that you'll need a comparison function object which acts like operator<, but which only looks at the key field of your struct when comparing an element against the searched-for value. The following is a rough template...
class cMyComparison
{
public:
    bool operator() (const structtype& p_Struct, const fieldtype& p_Value) const
    {
        return (p_Struct.field < p_Value);
        // Warning : I have a habit of getting this comparison backwards,
        // and I haven't double-checked this - std::lower_bound expects
        // "element < value".
    }
};
This kind of thing should get simpler in the next C++ standard revision, when IIRC we'll get anonymous functions (lambdas) and closures.
If you can't put the sort in your app's initialisation, you might need an already-sorted boolean static variable to ensure you only sort once.
Note - this is for information only - in your case, I think you should either stick with linear search or use a switch statement. The binary search is probably only a good idea when...
There are a lot of data items to search
Searches are done very frequently (many times per second)
The key enumeration values are sparse (lots of big gaps) - otherwise, switch is better.
If the coding effort were trivial, it wouldn't be a big deal, but C++ currently makes this a bit harder than it should be.
One minor note - it may be a good idea to define an enumerator for the size of your array, and to ensure that your static array declaration uses that enumerator. That way, your compiler should complain if you modify the table (add/remove items) and forget to update the size enum, and your searches should never miss items or go out of bounds.
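A sketch of that size-enumerator idea (DUTCH and the count are made up for illustration):

enum { LANGUAGE_TABLE_SIZE = 3 }; // keep in sync with the table below

static const LanguageName languages[LANGUAGE_TABLE_SIZE] = {
    { GERMAN,  "German"  },
    { ENGLISH, "English" },
    { DUTCH,   "Dutch"   },
};
// Adding an entry without bumping LANGUAGE_TABLE_SIZE is a compile error
// ("too many initializers"); removing one leaves a zero-filled slot that a
// debug assertion can catch.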
I think you have two questions here:
What is the best way to store a constant global variable (with possible multi-threaded access)?
Which container should you use to store your data?
The solution described by sbi is elegant, but you should be aware of two potential problems:
In the case of multi-threaded access, the initialization could be screwed up.
You may attempt to access the variable after its destruction.
Both issues concerning the lifetime of static objects are covered in another thread.
Let's begin with the constant global variable storage issue.
The solution proposed by sbi is therefore adequate if you are not concerned by 1. or 2.; in any other case I would recommend the use of a singleton, such as the ones provided by Loki. Read the associated documentation to understand the various lifetime policies; it is very valuable.
I think that the use of an array + a map seems wasteful and it hurts my eyes to read this. I personally prefer a slightly more elegant (imho) solution.
const char* find_language(Language lang)
{
    typedef std::map<Language, const char*> map_type;
    typedef map_type::value_type value_type;

    // I'll let you work out how 'my_stl_builder' works,
    // it makes for an interesting exercise and it's easy enough.
    // Note that even if this is slightly slower (?), it is only executed ONCE!
    static const map_type lang_map = my_stl_builder<map_type>()
        << value_type(GERMAN, "German")
        << value_type(ENGLISH, "English")
        << value_type(DUTCH, "Dutch")
        // ...
        ;

    map_type::const_iterator it = lang_map.find(lang);
    if( it == lang_map.end() ) return NULL;
    return it->second;
}
And now on to the container type issue.
If you are concerned about performance, then you should be aware that for small data collections, a vector of pairs is normally more efficient in lookups than a map. Once again I would turn toward Loki (and its AssocVector), but really I don't think that you should worry about performance.
I tend to choose my container depending on the interface I am likely to need first and here the map interface is really what you want.
Also: why do you use a 'const char*' rather than a 'std::string'?
I have seen too many people use a 'const char*' as if it were a std::string (forgetting, for instance, that you have to use strcmp) to be bothered by the alleged loss of memory / performance...
It depends on the purpose of the array. If you plan on showing the values in a list (for a user selection, perhaps) the array would be the most efficient way of storing them. If you plan on frequently looking up values by their enum key, you should look into a more efficient data structure like a map.
There is no need to write a loop. You can use the enum value as an index into the array.
I would make an enum with sequential language codes
enum { GERMAN=0, ENGLISH, SWAHILI, ENOUGH };
Then put them all into an array:
const char *langnames[] = {
"German", "English", "Swahili"
};
Then I would check that sizeof(langnames) == sizeof(*langnames)*ENOUGH in a debug build.
And pray that I have no duplicates or swapped languages ;-)
If you want a fast and simple solution, you can try something like this:
#include <string>

enum ELanguage { GERMAN = 0, ENGLISH = 1 };

static const std::string Ger = "GERMAN";
static const std::string Eng = "ENGLISH";

bool getLanguage(const ELanguage& aIndex, std::string& arName)
{
    switch (aIndex)
    {
        case GERMAN:
        {
            arName = Ger;
            return true;
        }
        case ENGLISH:
        {
            arName = Eng;
            return true; // was missing: control fell through into the error path
        }
        default:
        {
            // log error
            return false;
        }
    }
}