Comparing objects in C and C++ - c++

I'm working on a project where I should instrument a program (written in C and C++) by inserting a print statement before
the statements that respect some criteria. Then, I should compare those values for different executions.
Since C has structures, while C++ also lets one define classes, I was wondering if there is a particular method that:
Lets me print primitives as well as complex data structures.
Lets me compare those values across different executions, based on the format used by the print module (point 1).
Just an example to clarify my question. Suppose that I have two different executions with this data structure:
struct Point {
int x, y;
};
int main() {
int k = random();
Point p = foo(k);
some_print(p); // Print the value of 'p' in a file
return 0;
}
and then, another module will compare the two values of the point 'p' generated with the two executions.

The pragmatic C++-way of printing an object is defining a friend function:
std::ostream& operator<<(std::ostream& os, const Point& point) {
return os << "(" << point.x << "," << point.y << ")";
}
It's usually class-specific, so you need to implement it yourself; however, you might use some form of reflection. Particularly interesting is a CppCon talk from Antony Polukhin [1], which gives reflection for POD types (like Point above). Generic reflection without external tools is not available yet (as of 2016), though there is a proposal for it. If you can't or don't want to wait, you can do multiple things:
Parse C++ code: ctags comes to mind.
Macros: It's relatively easy to write a FIELDS macro that defines a reflection class and the fields.
FIELDS(
(int)x,
(int)y
)
Tuples: Works only if you define all your fields on the same inheritance level. Inherit privately from a std::tuple<> which contains all your fields. Make const and optionally non-const getters for fields in terms of std::get<>. Then you can iterate over the types of your tuple.
(Would love to add more - pls. write comments if you have ideas.)
All the reflection methods also give you operator==() basically for free. Note that it's more pragmatic to add operator<() when possible. The former can be defined in terms of the latter (albeit suboptimally: a == b iff !(a < b) && !(b < a) ), and the latter gives you std::set<> and std::map<>. Or you can do all the comparisons in terms of reflection.
[1] https://www.youtube.com/watch?v=abdeAew3gmQ
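To make the tuple idea concrete without any reflection machinery, here is a minimal sketch. It assumes the Point struct from the question; the helper names fields() and dump() are made up for illustration. std::tie builds a tuple of references, so operator== and operator< come almost for free, and dump() produces the text that a separate module could compare between two executions.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <tuple>

struct Point {
    int x;
    int y;
};

// Hypothetical helper: exposes the fields as a tuple of references.
inline std::tuple<const int&, const int&> fields(const Point& p) {
    return std::tie(p.x, p.y);
}

// Both comparisons delegate to the tuple, which compares field by field.
inline bool operator==(const Point& a, const Point& b) { return fields(a) == fields(b); }
inline bool operator<(const Point& a, const Point& b) { return fields(a) < fields(b); }

inline std::ostream& operator<<(std::ostream& os, const Point& p) {
    return os << "(" << p.x << "," << p.y << ")";
}

// Hypothetical helper: serialises any streamable value to text, so the
// output of two runs can be compared as plain strings.
template <typename T>
std::string dump(const T& value) {
    std::ostringstream out;
    out << value;
    return out.str();
}
```

Comparing the dumped strings only works if the format is stable and unambiguous, so for floating-point members you would probably want to fix the precision as well.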

In C++, you can implement an equality check in your specific class.
One option is a boolean equals() method that checks whether two objects hold the same values, so that
object1.equals(object2) returns either true or false. The more idiomatic option is to overload operator==.
As an example (adapted from one I found online):
class Car
{
private:
std::string m_make;
std::string m_model;
// Declared as a friend so the two-argument operator can read the private members.
friend bool operator== (const Car &c1, const Car &c2)
{
return (c1.m_make == c2.m_make &&
c1.m_model == c2.m_model);
}
};
Something like this should be implemented in your own class.


How to declare an operator overloading < to dynamically shift between class variables whenever needed? [closed]

I've searched many sites and browsed a few books, but failed to find good sources on how to dynamically (at run time) compare different data members of a class with the less-than (<) or greater-than (>) operators.
Let's say we've the following snippet code:
#include <iostream>
using namespace std;
class OP
{
private:
string alias;
float locatorX;
int coordinate;
public:
bool operator<(const OP& rhs)
{
return (this->locatorX < rhs.locatorX);
//Here! How do I do to make the compiler understand that
//I compare i.e alias or coordinate whenever I need?
//I tried with:
return (this->coordinate < rhs.coordinate, this->alias < rhs.alias);
//But it didn't really do the trick when implemented
//a sort algorithm inside main as it failed to sort a string array.
}
};
EDIT:
Since most of the kind people here did not understand the question, here is a scenario that will hopefully make it clear.
Let us say we want to create a map that accepts string, int and float types. We create a function inside the class OP that accepts all the given data types and saves them in a class array, so that we have, e.g., 15 records in it.
How can I dynamically bubble sort (with the help of the < operator) by alias (string), locatorX (float) or coordinate (int), whichever I choose, in ascending order?
For example, I need to sort by coordinate or by alias (if needed) at run time. How do I do this?
Example output:
(First position in array):
"Albert street 5th"
Coordinate: 1691
locatorX: 19.52165
(Second position in array):
"Main street 7th alley"
Coordinate: 59
locatorX: 8175.12
(Third position in array):
"Elm/Kentucky"
Coordinate: 9517
locatorX: 271.41
Typically you'd create a separate comparator for each comparison you wish to implement. You can't munge them into a single operator< and, although you could technically produce a different function that performed a different comparison depending on the value of some new, third argument, it would be incompatible with almost everything currently existing that knows how to work with comparators.
This is one of the scenarios in which operator overloading specifically is the wrong tool for the job.
There seem to be several ways to do so:
Switch between comparison functions at the call site
You have to define separate compare functions for different fields.
std::vector<Object> v;
enum class OrderBy
{
alias,
coordinate
};
OrderBy order_by = get_order_by_from_user();
switch (order_by)
{
case OrderBy::alias:
std::sort(v.begin(), v.end(), compare_by_alias());
break;
case OrderBy::coordinate:
std::sort(v.begin(), v.end(), compare_by_coordinate());
break;
}
Make a choice inside a comparison function.
You must communicate the choice of ordering field somehow into the function.
The options are: global or singleton "configuration" object, member variable in the comparison class. I would avoid any globals, thus the second option:
struct compare_by_field
{
OrderBy order_by_;
compare_by_field(OrderBy order_by) : order_by_(order_by)
{}
bool operator()(const Object & lhs, const Object & rhs) const
{
switch (order_by_)
{
case OrderBy::alias:
return lhs.alias < rhs.alias;
case OrderBy::coordinate:
return lhs.coordinate < rhs.coordinate;
}
return false; // not reached for valid OrderBy values; avoids falling off the end
}
};
std::sort(v.begin(), v.end(), compare_by_field(get_order_by_from_user()));
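For completeness, here is a minimal sketch of the same idea with lambdas instead of hand-written functor classes. The Object struct mirrors the members of the OP class from the question; make_comparator and sort_by are names invented for this sketch, and std::function is used only to give the three lambdas one common type.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

struct Object {
    std::string alias;
    float locatorX;
    int coordinate;
};

enum class OrderBy { alias, locatorX, coordinate };

using Comparator = std::function<bool(const Object&, const Object&)>;

// Returns the "less than" comparison that matches the requested field.
inline Comparator make_comparator(OrderBy order_by) {
    switch (order_by) {
        case OrderBy::alias:
            return [](const Object& l, const Object& r) { return l.alias < r.alias; };
        case OrderBy::locatorX:
            return [](const Object& l, const Object& r) { return l.locatorX < r.locatorX; };
        case OrderBy::coordinate:
            return [](const Object& l, const Object& r) { return l.coordinate < r.coordinate; };
    }
    // Not reached for valid enum values; treat everything as equivalent.
    return [](const Object&, const Object&) { return false; };
}

// Sorts in ascending order by the field chosen at run time.
inline void sort_by(std::vector<Object>& v, OrderBy order_by) {
    std::sort(v.begin(), v.end(), make_comparator(order_by));
}
```

The same comparator can be handed to a hand-written bubble sort instead of std::sort, as long as that sort takes the comparison as a parameter rather than hard-coding operator<.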

C++ array of pointers - initializing and accessing objects

Having problems with array of pointers.
I have a custom class called C. C has a variable double c1. I have to sort an array of C objects by c1 using a custom-written sorting algorithm. I am guessing that since I have to move objects in the array, it would be much more efficient to just move pointers to objects; therefore I should use not an array of objects, but an array of pointers to objects.
I initialized the array like that:
C** someC;
someC = new C*[size];
for(int i = 0; i < size; i++) {
// a and b are of type CPoint
someC[i] = new C(a,b);
}
Am I doing this part correctly? It is the calling of C objects that then causes problems:
someC[i]->a.x
gives me an error: left of '->a' must point to class/struct/union/generic type
I am new to C++ so I may be missing something obvious, but I did some research and did not find anything. Maybe I am not understanding well how pointers work...
UPDATE
The header file of C class:
#pragma once
class C
{
public:
CPoint a;
CPoint b;
double c1;
C(void);
C(CPoint,CPoint);
~C(void);
};
the implementation:
#include "StdAfx.h"
#include "C.h"
#include <math.h>
C::C(void)
{
}
C::C(CPoint a, CPoint b)
{
this->a=a;
this->b=b;
double c1_x = a.x - b.x;
double c1_y = a.y - b.y;
c1= sqrt( (c1_x * c1_x) + (c1_y * c1_y));
}
C::~C(void)
{
}
UPDATE
The problem was in the code I provided in the comments, I did not notice I was calling the array in a wrong way like this:
pDC->MoveTo(someC[i]->a.x, someC->a.y)
So the second call was incorrect. Thank you all for your help
Philosophy aside, this is pretty telling from your comment (emphasis added):
"I am actually calling someC in OnDraw method like that: pDC->MoveTo(someC[i]->a.x, someC->a.y); someC is defined as public in the header file"
Specifically, this in your parameter list :
someC[i]->a.x, someC->a.y
This tells me one of these is wrong. Judging by your error, I'm going to go with the first one. It would solidify that if we could see the definition of your object that is implementing OnDraw() and where exactly it is getting someC from.
If someC is a C* in your containing object, the second parameter is correct, the first is wrong.
If someC is a C** in your containing object, then the first parameter is correct and the second is wrong.
Unless your C objects are really really expensive to copy construct, don't bother implementing your custom sort algorithm, but define a strict total order over C:
bool operator<(C const& lhs, C const& rhs) {
return lhs.c1 < rhs.c1;
}
and use std::sort on an std::vector<C>. If you do worry about copy construction overhead, you can also directly use an std::set<C> which automatically sorts itself, without copy construction.
After your edit: your C seems relatively small and easy to copy, but it's borderline; your best bet is to try both approaches (set and vector) and benchmark which one is faster.
If your type contains just a double, I'd guess that it would be much faster not to use pointers! If the objects contain a std::string or a std::vector<T> (for some type T) the picture probably changes, but compared to moving a structure with one or two basic objects, the cost of accessing more or less randomly distributed data is fairly high. Of course, to determine the specific situation you would need to profile both approaches.
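The two approaches discussed above can be sketched side by side so that they are easy to benchmark against each other. This is only an illustration: Point2D is a hypothetical stand-in for MFC's CPoint so that the code compiles anywhere, and sort_by_value / sort_by_pointer are names made up here.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical stand-in for MFC's CPoint.
struct Point2D {
    long x;
    long y;
};

struct C {
    Point2D a;
    Point2D b;
    double c1;  // distance between a and b

    C(Point2D a_, Point2D b_) : a(a_), b(b_) {
        const double dx = static_cast<double>(a.x - b.x);
        const double dy = static_cast<double>(a.y - b.y);
        c1 = std::sqrt(dx * dx + dy * dy);
    }
};

// Orders C objects by their c1 member.
inline bool operator<(const C& lhs, const C& rhs) { return lhs.c1 < rhs.c1; }

// Approach 1: sort the objects themselves (moves whole C values around).
inline void sort_by_value(std::vector<C>& v) { std::sort(v.begin(), v.end()); }

// Approach 2: sort pointers to the objects (moves only pointers, but every
// comparison has to dereference two of them).
inline void sort_by_pointer(std::vector<C*>& v) {
    std::sort(v.begin(), v.end(), [](const C* l, const C* r) { return *l < *r; });
}
```

Which one wins depends on the object size and the memory layout, so timing both on realistic data is the only reliable answer.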

C++ - Check if One Array of Strings Contains All Elements of Another

I've recently been porting a Python application to C++, but am now at a loss as to how I can port a specific function. Here's the corresponding Python code:
def foo(a, b): # Where `a' is a list of strings, as is `b'
for x in a:
if not x in b:
return False
return True
I wish to have a similar function:
bool
foo (char* a[], char* b[])
{
// ...
}
What's the easiest way to do this? I've tried working with the STL algorithms, but can't seem to get them to work. For example, I currently have this (using the glib types):
gboolean
foo (gchar* a[], gchar* b[])
{
gboolean result;
std::sort (a, (a + (sizeof (a) / sizeof (*a))); // The second argument corresponds to the size of the array.
std::sort (b, (b + (sizeof (b) / sizeof (*b)));
result = std::includes (b, (b + (sizeof (b) / sizeof (*b))),
a, (a + (sizeof (a) / sizeof (*a))));
return result;
}
I'm more than willing to use features of C++11.
I'm just going to add a few comments to what others have stressed and give a better algorithm for what you want.
Do not use pointers here. Using pointers doesn't make it c++, it makes it bad coding. If you have a book that taught you c++ this way, throw it out. Just because a language has a feature does not mean it is proper to use it anywhere you can. If you want to become a professional programmer, you need to learn to use the appropriate parts of your languages for any given action. When you need a data structure, use the one appropriate to your activity. Pointers aren't data structures, they are reference types used when you need an object with state lifetime - i.e. when an object is created on one asynchronous event and destroyed on another. If an object lives its lifetime without any asynchronous wait, it can be modeled as a stack object and should be. Pointers should never be exposed to application code without being wrapped in an object, because standard operations (like new) can throw exceptions, and pointers do not clean themselves up. In other words, pointers should always be used only inside classes and only when necessary to respond with dynamically created objects to events external to the class (which may be asynchronous).
Do not use arrays here. Arrays are simple homogeneous collection data types of stack lifetime of size known at compiletime. They are not meant for iteration. If you need an object that allows iteration, there are types that have built in facilities for this. To do it with an array, though, means you are keeping track of a size variable external to the array. It also means you are enforcing external to the array that the iteration will not extend past the last element using a newly formed condition each iteration (note this is different than just managing size - it is managing an invariant, the reason you make classes in the first place). You do not get to reuse standard algorithms, are fighting decay-to-pointer, and generally are making brittle code. Arrays are (again) useful only if they are encapsulated and used where the only requirement is random access into a simple type, without iteration.
Do not sort a vector here. This one is just odd, because it is not a good translation from your original problem, and I'm not sure where it came from. Don't optimise early, but don't pessimise early by choosing a bad algorithm either. The requirement here is to look for each string inside another collection of strings. A sorted vector is an invariant (so, again, think something that needs to be encapsulated) - you can use existing classes from libraries like boost or roll your own. However, a little bit better on average is to use a hash table. With amortised O(N) lookup (with N the size of a lookup string - remember it's amortised O(1) number of hash-compares, and for strings this O(N)), a natural first way to translate "look up a string" is to make an unordered_set<string> be your b in the algorithm. This changes the complexity of the algorithm from O(NM log P) (with N now the average size of strings in a, M the size of collection a and P the size of collection b), to O(NM). If the collection b grows large, this can be quite a savings.
In other words
gboolean foo(vector<string> const& a, unordered_set<string> const& b)
Note that you can now pass const references to the function. If you build your collections with their use in mind, then you often have potential extra savings down the line.
The point with this response is that you really should never get in the habit of writing code like that posted. It is a shame that there are a few really (really) bad books out there that teach coding with strings like this, and it is a real shame because there is no need to ever have code look that horrible. It fosters the idea that c++ is a tough language, when it has some really nice abstractions that do this easier and with better performance than many standard idioms in other languages. An example of a good book that teaches you how to use the power of the language up front, so you don't build bad habits, is "Accelerated C++" by Koenig and Moo.
But also, you should always think about the points made here, independent of the language you are using. You should never try to enforce invariants outside of encapsulation - that was the biggest source of savings of reuse found in Object Oriented Design. And you should always choose your data structures appropriate for their actual use. And whenever possible, use the power of the language you are using to your advantage, to keep you from having to reinvent the wheel. C++ already has string management and compare built in, it already has efficient lookup data structures. It has the power to make many tasks that you can describe simply coded simply, if you give the problem a little thought.
Your first problem is related to the way arrays are (not) handled in C++. Arrays live a kind of very fragile shadow existence where, if you as much as look at them in a funny way, they are converted into pointers. Your function doesn't take two pointers-to-arrays as you expect. It takes two pointers to pointers.
In other words, you lose all information about the size of the arrays. sizeof(a) doesn't give you the size of the array. It gives you the size of a pointer to a pointer.
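A small sketch of what happens here, and of one way to keep the size when you really want a built-in array. The name element_count is made up for this example (C++17 ships the equivalent std::size).

```cpp
#include <cstddef>
#include <type_traits>

// A parameter declared as an array is adjusted to a pointer: the two
// signatures below denote exactly the same function type.
static_assert(std::is_same<void(const char* []), void(const char**)>::value,
              "array parameters are adjusted to pointers");

// Taking the array by reference keeps its length as part of the type,
// so the compiler can deduce N at the call site.
template <typename T, std::size_t N>
constexpr std::size_t element_count(const T (&)[N]) {
    return N;
}
```

This only works where the real array is visible; once it has been passed through a gchar** parameter the length is gone and has to be supplied separately.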
So you have two options: the quick and dirty ad-hoc solution is to pass the array sizes explicitly:
gboolean foo (gchar** a, int a_size, gchar** b, int b_size)
Alternatively, and much nicer, you can use vectors instead of arrays:
gboolean foo (const std::vector<gchar*>& a, const std::vector<gchar*>& b)
Vectors are dynamically sized arrays, and as such, they know their size. a.size() will give you the number of elements in a vector. But they also have two convenient member functions, begin() and end(), designed to work with the standard library algorithms.
So, to sort a vector:
std::sort(a.begin(), a.end());
And likewise for std::includes.
Your second problem is that you don't operate on strings, but on char pointers. In other words, std::sort will sort by pointer address, rather than by string contents.
Again, you have two options:
If you insist on using char pointers instead of strings, you can specify a custom comparer for std::sort (using a lambda because you mentioned you were ok with them in a comment)
std::sort(a.begin(), a.end(), [](gchar* lhs, gchar* rhs) { return strcmp(lhs, rhs) < 0; });
Likewise, std::includes takes an optional fifth parameter used to compare elements. The same lambda could be used there.
Alternatively, you simply use std::string instead of your char pointers. Then the default comparer works:
gboolean
foo (std::vector<std::string> a, std::vector<std::string> b)
{
// Taken by value: std::sort needs mutable ranges, and the caller's data stays untouched.
std::sort (a.begin(), a.end());
std::sort (b.begin(), b.end());
return std::includes (b.begin(), b.end(),
a.begin(), a.end());
}
Simpler, cleaner and safer.
The sort in the C++ version isn't working because it's sorting the pointer values (comparing them with std::less as it does with everything else). You can get around this by supplying a proper comparison functor. But why aren't you actually using std::string in the C++ code? The Python strings are real strings, so it makes sense to port them as real strings.
In your sample snippet your use of std::includes is pointless since it will use operator< to compare your elements. Unless you are storing the same pointers in both your arrays the operation will not yield the result you are looking for.
Comparing addresses is not the same thing as comparing the true content of your C-style strings.
You'll also have to supply std::sort with the necessary comparator, preferably std::strcmp (wrapped in a functor).
It's currently suffering from the same problem as your use of std::includes, it's comparing addresses instead of the contents of your c-style-strings.
This whole "problem" could have been avoided by using std::strings and std::vectors.
Example snippet
#include <iostream>
#include <algorithm>
#include <cstring>
typedef char gchar;
gchar const * a1[5] = {
"hello", "world", "stack", "overflow", "internet"
};
gchar const * a2[] = {
"world", "internet", "hello"
};
...
int
main (int argc, char *argv[])
{
auto Sorter = [](gchar const* lhs, gchar const* rhs) {
return std::strcmp (lhs, rhs) < 0;
};
std::sort (a1, a1 + 5, Sorter);
std::sort (a2, a2 + 3, Sorter);
if (std::includes (a1, a1 + 5, a2, a2 + 3, Sorter)) {
std::cerr << "all elements in a2 was found in a1!\n";
} else {
std::cerr << "all elements in a2 wasn't found in a1!\n";
}
}
output
all elements in a2 was found in a1!
A naive transcription of the python version would be:
bool foo(std::vector<std::string> const &a,std::vector<std::string> const &b) {
for(auto &s : a)
if(end(b) == std::find(begin(b),end(b),s))
return false;
return true;
}
It turns out that sorting the input is very slow. (And wrong in the face of duplicate elements.) Even the naive function is generally much faster. Just goes to show again that premature optimization is the root of all evil.
Here's an unordered_set version that is usually somewhat faster than the naive version (or was for the values/usage patterns I tested):
bool foo(std::vector<std::string> const& a,std::unordered_set<std::string> const& b) {
for(auto &s:a)
if(b.count(s) < 1)
return false;
return true;
}
On the other hand, if the vectors are already sorted and b is relatively small ( less than around 200k for me ) then std::includes is very fast. So if you care about speed you just have to optimize for the data and usage pattern you're actually dealing with.
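If the caller only has two plain vectors, the hash-set variant can build its lookup table internally. The following is a minimal sketch under that assumption; contains_all is a name chosen here, not something from the answers above.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Returns true when every string in `a` also occurs in `b`.
// Duplicates in `a` are fine: each element is looked up independently.
inline bool contains_all(const std::vector<std::string>& a,
                         const std::vector<std::string>& b) {
    // Build the lookup table once; each query is then O(1) on average.
    const std::unordered_set<std::string> lookup(b.begin(), b.end());
    return std::all_of(a.begin(), a.end(), [&lookup](const std::string& s) {
        return lookup.count(s) != 0;
    });
}
```

Building the set costs time proportional to the size of b on every call, so if b is reused across many calls it is probably better to keep the unordered_set around, as in the version above that takes it as a parameter.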

C++ domain specific embedded language operators

In numerically oriented languages (Matlab, Fortran), the range operator and its semantics are very handy when working with multidimensional data.
For example:
A(i:j,k,:n) // represents two-dimensional slice B(i:j,0:n) of A at index k
Unfortunately, C++ does not have a range operator (:). Of course it can be emulated using a range/slice functor, but the semantics are less clean than in Matlab. I am prototyping a matrix/tensor domain language in C++ and am wondering if there are any options to reproduce the range operator.
I would still like to rely on the C++/preprocessor framework exclusively.
So far I have looked through Boost.Wave, which might be a suitable option.
Are there any other means to introduce new non-native operators to a C++ DSL?
I know you cannot add new operators. I am specifically looking for a workaround.
One thing I came up (very ugly hack and I do not intend to use):
#define A(r) A[range(((1)?r), ((0)?r))] // assume A overloads []
A(i:j); // abuse ternary operator
A solution that I've used before is to write an external preprocessor that parses the source and replaces any uses of your custom operator with vanilla C++. For your purposes, a : b uses would be replaced with something like a.operator_range_(b), and operator:() declarations with declarations of range_ operator_range_(). In your makefile you then add a rule that preprocesses source files before compiling them. This can be done with relative ease in Perl.
However, having worked with a similar solution in the past, I do not recommend it. It has the potential to create maintainability and portability issues if you do not remain vigilant of how source is processed and generated.
No -- you can't define your own operators in C++. Bjarne Stroustrup details why.
As Billy said, you cannot define new operators. However, you can come very close to what you want with "regular" operator overloading (and maybe some template metaprogramming). It would be quite easy to allow for something like this:
#include <iostream>
class FakeNumber {
int n;
public:
FakeNumber(int nn) : n(nn) {}
operator int() const { return n; }
};
class Range {
int f, t;
public:
Range(const int& ff, const int& tt) : f(ff), t(tt) {}
int from() const { return f; }
int to() const { return t; }
};
Range operator-(const FakeNumber& a, const int b) {
return Range(a,b);
}
class Matrix {
public:
void operator()(const Range& a, const Range& b) {
std::cout << "(" << a.from() << ":" << a.to() << "," << b.from() << ":" << b.to() << ")" << std::endl;
}
};
int main() {
FakeNumber a=1,b=2,c=3,d=4;
Matrix m;
m(a-b,c-d);
return 0;
}
The downside is that this solution doesn't support all-literal expressions. Either from or to has to be of a user-defined class type, since we can't overload operator- for two primitive types.
You can also overload operator* to allow specifying stepping, like so:
m(a-b*3,c-d); // equivalent to m[a:b:3,c:d]
And overload both versions of operator-- to allow ignoring one of the bounds:
m(a--,--d); // equivalent to m[a:,:d]
Another option is to define two objects, named something like Matrix::start and Matrix::end, or whatever you like, and then instead of using operator--, you could use them, and then the other bound wouldn't have to be a variable and could be a literal:
m(start-15,38-end); // This clutters the syntax however
And you could of course use both ways.
I think it's pretty much the best you can get without resorting to bizarre solutions, such as custom prebuild tools or macro abuse (of the sort Matthieu presented and advised against using :)).
An alternative is to build a C++ variant dialect using a program transformation tool.
The DMS Software Reengineering Toolkit is a program transformation engine, with an industrial strength C++ Front End. DMS, using this front end, can parse full C++ (it even has a preprocessor and can retain most preprocessor directives unexpanded), automatically build ASTs and complete symbol tables.
The C++ front end comes in source, using a grammar derived directly from the standard. It is technically straightforward to add new grammar rules including those that would allow ":" syntax as array subscripts as you have described, and as Fortran90+ has implemented. One can then use the program transformation capability of DMS to transform the "new" syntax into "vanilla" C++ for use in conventional C++ compilers. (This scheme is a generalization of the Intentional Programming model of "add DSL concepts to your language").
We in fact did a concept demonstration of "Vector C++" using this approach.
We added a multidimensional Vector datatype, whose storage semantics are only that array elements are distinct. This is different than C++'s model of sequential locations, but you need this different semantic if you want the compiler/transformer to have freedom to lay out memory arbitrarily, and this is fundamental if you want to use SIMD machine instructions and/or efficient cache accesses along different axes.
We added Fortran-90 style scalar and subarray range accesses, added virtually all of F90's array-processing operations, added a good fraction of APL's matrix operations, all by adjusting the DMS C++ grammar.
Finally, we built two translators using DMS transformational capability: one mapping a significant part of this (remember, this was a concept demo) to vanilla C++ so you could compile and run Vector C++ applications on a typical workstation, and the other mapping C++ to a PowerPC C++ dialect with SIMD instruction extensions, and we generated SIMD code that was pretty reasonable we thought. Took us about 6 man-months to do all this.
The customer for this ultimately bailed out (his business model didn't include supporting a custom compiler in spite of his severe need for parallel/SIMD based operations), and it has been languishing on the shelf. We've chosen not to pursue this in the broader market because it isn't clear what the market really is. I'm pretty sure there are organizations for which this would be valuable.
Point is, you really can do this. It is almost impossible using ad hoc methods. It is technically quite straightforward with a strong enough program transformation system. It isn't a walk in the park.
The easiest solution is to use a method on matrix instead of an operator.
A.range(i, j, k, n);
Note that typically you do not use , in a subscript operator [], eg A[i][j] instead of A[i,j]. The second form could be possible by overloading the comma operator but then you force i and j to be objects not numbers.
You could define a range class that could be used as a subscript for your matrix class.
class RealMatrix
{
public:
MatrixRowRangeProxy operator[] (int i) {
return operator[](range(i, 1));
}
MatrixRowRangeProxy operator[] (range r);
// ...
RealMatrix(const MatrixRangeProxy proxy);
};
// A generic view on a matrix
class MatrixProxy
{
protected:
RealMatrix * matrix;
};
// A view on a matrix of a range of rows
class MatrixRowRangeProxy : public MatrixProxy
{
public:
MatrixColRangeProxy operator[] (int i) {
return operator[](range(i, 1));
}
MatrixColRangeProxy operator[] (const range & r);
// ...
};
// A view on a matrix of a range of columns
class MatrixColRangeProxy : public MatrixProxy
{
public:
MatrixRangeProxy operator[] (int i) {
return operator[](range(i, 1));
}
MatrixRangeProxy operator[] (const range & r);
// ...
};
Then you can copy a range from one matrix into another.
RealMatrix A = ...
RealMatrix B = A[range(i,j)][range(k,n)];
Finally by creating a Matrix class that can hold either a RealMatrix or a MatrixProxy you can make a RealMatrix and a MatrixProxy appear the same from the outside.
Note the operator[] on the proxies are not and cannot be virtual.
If you want to have fun, you may check out IdOp.
If you are really working on a project, I don't suggest using this trick though. Maintenance will suffer from clever tricks.
Your best bet is thus to bite the bullet and use explicit notation. A short function called range which yields a custom defined object for which the operators are overloaded seems especially suitable.
Matrix<10,30,50> matrix = /**/;
MatrixView<5,6,7> view = matrix[range(0,5)][range(0,6)][range(0,7)];
Matrix<5,6,7> n = view;
Note that the operator[] only has 4 overloads (const/non-const + basic int / range) and yields a proxy object (until the last dimension). Once applied to the last dimension, it gives a view of the matrix. A normal matrix may be built from a view that has the same dimensions (non-explicit constructor).
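As a rough illustration of the explicit-notation approach, here is a minimal sketch for the two-dimensional case. It deliberately simplifies the designs above: it uses operator() with two Range arguments instead of chained operator[], and it returns a copy of the selected block rather than a proxy or view. The Range struct, the range() function and this Matrix class are all hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Half-open index interval [first, last).
struct Range {
    std::size_t first;
    std::size_t last;
    std::size_t size() const { return last - first; }
};

inline Range range(std::size_t first, std::size_t last) {
    if (first > last) throw std::invalid_argument("range: first > last");
    return Range{first, last};
}

class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

    // Element access, row-major storage.
    double& operator()(std::size_t r, std::size_t c) { return data_[r * cols_ + c]; }
    double operator()(std::size_t r, std::size_t c) const { return data_[r * cols_ + c]; }

    // Slice access: copies the block selected by a row range and a column range.
    Matrix operator()(Range r, Range c) const {
        if (r.last > rows_ || c.last > cols_) throw std::out_of_range("Matrix slice");
        Matrix out(r.size(), c.size());
        for (std::size_t i = 0; i < r.size(); ++i)
            for (std::size_t j = 0; j < c.size(); ++j)
                out(i, j) = (*this)(r.first + i, c.first + j);
        return out;
    }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> data_;
};
```

With this, the Matlab expression A(i:j, k:n) is written A(range(i, j + 1), range(k, n + 1)), because the sketch uses half-open intervals while Matlab ranges include both ends.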

Best way to store constant data in C++

I have an array of constant data like following:
enum Language {GERMAN=LANG_DE, ENGLISH=LANG_EN, ...};
struct LanguageName {
Language language;
const char *name;
};
const LanguageName languages[] = {
GERMAN, "German",
ENGLISH, "English",
.
.
.
};
I have a function which accesses the array and finds the entry based on a Language enum parameter. Should I write a loop to find the specific entry in the array, or are there better ways to do this?
I know I could add the LanguageName-objects to an std::map but wouldn't this be overkill for such a simple problem? I do not have an object to store the std::map so the map would be constructed for every call of the function.
What way would you recommend?
Is it better to encapsulate this compile time constant array in a class which handles the lookup?
If the enum values are contiguous starting from 0, use an array with the enum as index.
If not, this is what I usually do:
const char* find_language(Language lang)
{
typedef std::map<Language,const char*> lang_map_type;
typedef lang_map_type::value_type lang_map_entry_type;
static const lang_map_entry_type lang_map_entries[] = { /*...*/ };
static const lang_map_type lang_map( lang_map_entries
, lang_map_entries + sizeof(lang_map_entries)
/ sizeof(lang_map_entries[0]) );
lang_map_type::const_iterator it = lang_map.find(lang);
if( it == lang_map.end() ) return NULL;
return it->second;
}
If you consider a map for constants, always also consider using a vector.
Function-local statics are a nice way to get rid of a good part of the dependency problems of globals, but their initialization is dangerous in a multi-threaded environment before C++11 (since C++11 the language guarantees that it is thread-safe). If you're worried about that, you might rather want to use globals:
typedef std::map<Language,const char*> lang_map_type;
typedef lang_map_type::value_type lang_map_entry_type;
const lang_map_entry_type lang_map_entries[] = { /*...*/ };
const lang_map_type lang_map( lang_map_entries
, lang_map_entries + sizeof(lang_map_entries)
/ sizeof(lang_map_entries[0]) );
const char* find_language(Language lang)
{
lang_map_type::const_iterator it = lang_map.find(lang);
if( it == lang_map.end() ) return NULL;
return it->second;
}
There are three basic approaches that I'd choose from. One is the switch statement, and it is a very good option under certain conditions. Remember - the compiler is probably going to compile that into an efficient table-lookup for you, though it will be looking up pointers to the case code blocks rather than data values.
Options two and three involve static arrays of the type you are using. Option two is a simple linear search - which you are (I think) already doing - very appropriate if the number of items is small.
Option three is a binary search. Static arrays can be used with standard library algorithms - just use the first and first+count pointers in the same way that you'd use begin and end iterators. You will need to ensure the data is sorted (using std::sort or std::stable_sort), and use std::lower_bound to do the binary search.
The complication in this case is that you'll need a comparison function object which acts like operator< with a stored or referenced value, but which only looks at the key field of your struct. The following is a rough template...
class cMyComparison
{
private:
const fieldtype& m_Value; // Note - only storing a reference
public:
cMyComparison (const fieldtype& p_Value) : m_Value (p_Value) {}
bool operator() (const structtype& p_Struct) const
{
return (p_Struct.field < m_Value);
// Warning : I have a habit of getting this comparison backwards,
// and I haven't double-checked this
}
};
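One caveat on the rough template above: std::lower_bound calls its comparator with two arguments (the element and the value being searched for), so a one-argument functor will not compile with it. A minimal sketch that does, assuming a hypothetical LangEntry struct, table and find_language_sorted function (none of these names are from the original post):

```cpp
#include <algorithm>
#include <cstddef>

// Sparse key values, chosen only for illustration.
enum Language { GERMAN = 10, ENGLISH = 20, DUTCH = 30 };

// Hypothetical table entry: a key field plus its payload.
struct LangEntry {
    Language key;
    const char* name;
};

// The table must be sorted by key for the binary search to work.
static const LangEntry lang_table[] = {
    { GERMAN, "German" },
    { ENGLISH, "English" },
    { DUTCH, "Dutch" },
};
static const std::size_t lang_count = sizeof(lang_table) / sizeof(lang_table[0]);

// Comparator in the shape std::lower_bound expects: (element, value).
// It only looks at the key field of the struct.
struct KeyLess {
    bool operator()(const LangEntry& entry, Language value) const {
        return entry.key < value;
    }
};

const char* find_language_sorted(Language lang) {
    const LangEntry* first = lang_table;
    const LangEntry* last = lang_table + lang_count;
    const LangEntry* it = std::lower_bound(first, last, lang, KeyLess());
    // lower_bound returns the first entry whose key is not less than lang,
    // so an exact match still has to be checked.
    if (it == last || it->key != lang) return NULL;
    return it->name;
}
```

Because the table here is written in sorted order, no runtime sort is needed; if the entries cannot be kept sorted in the source, they would have to be sorted once before the first lookup.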
This kind of thing should get simpler in the next C++ standard revision, when IIRC we'll get anonymous functions (lambdas) and closures.
If you can't put the sort in your app's initialisation, you might need an already-sorted boolean static variable to ensure you only sort once.
Note - this is for information only - in your case, I think you should either stick with linear search or use a switch statement. The binary search is probably only a good idea when...
There are a lot of data items to search
Searches are done very frequently (many times per second)
The key enumerate values are sparse (lots of big gaps) - otherwise, switch is better.
If the coding effort were trivial, it wouldn't be a big deal, but C++ currently makes this a bit harder than it should be.
One minor note - it may be a good idea to define an enumerator for the size of your array, and to ensure that your static array declaration uses that enumerator. That way, your compiler should complain if you add items to the table and forget to update the size enumerator, so your searches should never go out of bounds. Be aware that too few initialisers are not an error: the remaining elements are silently zero-initialised, so removed items still need care.
I think you have two questions here:
What is the best way to store a constant global variable (with possible multi-threaded access)?
How to store your data (which container to use)?
The solution described by sbi is elegant, but you should be aware of 2 potential problems:
In case of multi-threaded access, the initialization could be screwed up.
You will potentially attempt to access this variable after its destruction.
Both issues on the lifetime of static objects are being covered in another thread.
Let's begin with the constant global variable storage issue.
The solution proposed by sbi is therefore adequate if you are not concerned by 1. or 2.; in any other case I would recommend the use of a Singleton, such as the ones provided by Loki. Read the associated documentation to understand the various policies on lifetime, it is very valuable.
I think that the use of an array + a map seems wasteful and it hurts my eyes to read this. I personally prefer a slightly more elegant (imho) solution.
const char* find_language(Language lang)
{
  typedef std::map<Language, const char*> map_type;
  typedef map_type::value_type value_type;
  // I'll let you work out how 'my_stl_builder' works,
  // it makes for an interesting exercise and it's easy enough
  // Note that even if this is slightly slower (?), it is only executed ONCE!
  static const map_type lang_map = my_stl_builder<map_type>()
    << value_type(GERMAN, "German")
    << value_type(ENGLISH, "English")
    << value_type(DUTCH, "Dutch")
    ....
    ;
  map_type::const_iterator it = lang_map.find(lang);
  if( it == lang_map.end() ) return NULL;
  return it->second;
}
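The 'my_stl_builder' helper is left as an exercise above. One possible way to write it is sketched here, as an assumption rather than the answerer's actual implementation: a small template that collects values through operator<< and converts implicitly to the container. The three-entry language list is only illustrative.

```cpp
#include <cstddef>
#include <map>

enum Language { GERMAN, ENGLISH, DUTCH };

// Hypothetical helper: gathers values with operator<< and then converts
// to the finished container.
template <typename Container>
class my_stl_builder {
public:
    my_stl_builder& operator<<(const typename Container::value_type& value) {
        // Hinted insert at end() works for std::map as well as std::vector.
        m_container.insert(m_container.end(), value);
        return *this;
    }

    // Implicit conversion lets the whole builder expression initialise
    // a const container in one statement.
    operator Container() const { return m_container; }

private:
    Container m_container;
};

const char* find_language(Language lang) {
    typedef std::map<Language, const char*> map_type;
    typedef map_type::value_type value_type;

    // Built once, on the first call.
    static const map_type lang_map = my_stl_builder<map_type>()
        << value_type(GERMAN, "German")
        << value_type(ENGLISH, "English")
        << value_type(DUTCH, "Dutch");

    map_type::const_iterator it = lang_map.find(lang);
    if (it == lang_map.end()) return NULL;
    return it->second;
}
```

With a C++11 compiler the builder is not needed at all, since the map can be initialised directly from a braced list of pairs.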
And now on to the container type issue.
If you are concerned about performance, then you should be aware that for small data collections, a vector of pairs is normally more efficient in lookups than a map. Once again I would turn toward Loki (and its AssocVector), but really I don't think that you should worry about performance.
I tend to choose my container depending on the interface I am likely to need first and here the map interface is really what you want.
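For comparison, the vector-of-pairs alternative can be sketched with the standard library alone. This is not Loki's AssocVector, just a hand-rolled illustration of the same idea (a sorted vector searched with std::lower_bound); the names lang_entry, EntryKeyLess, make_lang_table and find_language_in_vector are made up for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

enum Language { GERMAN, ENGLISH, DUTCH };

typedef std::pair<Language, const char*> lang_entry;

// Compares on the key only: entry vs. entry for sorting,
// entry vs. bare key for std::lower_bound.
struct EntryKeyLess {
    bool operator()(const lang_entry& a, const lang_entry& b) const {
        return a.first < b.first;
    }
    bool operator()(const lang_entry& entry, Language key) const {
        return entry.first < key;
    }
};

// Builds the table and sorts it by key, so insertion order does not matter.
static std::vector<lang_entry> make_lang_table() {
    std::vector<lang_entry> table;
    table.push_back(lang_entry(ENGLISH, "English"));
    table.push_back(lang_entry(GERMAN, "German"));
    table.push_back(lang_entry(DUTCH, "Dutch"));
    std::sort(table.begin(), table.end(), EntryKeyLess());
    return table;
}

const char* find_language_in_vector(Language lang) {
    // Built and sorted once, on the first call.
    static const std::vector<lang_entry> table = make_lang_table();
    std::vector<lang_entry>::const_iterator it =
        std::lower_bound(table.begin(), table.end(), lang, EntryKeyLess());
    if (it == table.end() || it->first != lang) return NULL;
    return it->second;
}
```

The lookup code is longer than map::find, which is the interface trade-off mentioned above.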
Also: why do you use 'const char*' rather than a 'std::string'?
I have seen too many people using a 'const char*' like a std::string (like in forgetting that you have to use strcmp) to be bothered by the alleged loss of memory / performance...
It depends on the purpose of the array. If you plan on showing the values in a list (for a user selection, perhaps) the array would be the most efficient way of storing them. If you plan on frequently looking up values by their enum key, you should look into a more efficient data structure like a map.
There is no need to write a loop. You can use the enum value as index for the array.
I would make an enum with sequential language codes
enum { GERMAN=0, ENGLISH, SWAHILI, ENOUGH };
Then put them all into an array
const char *langnames[] = {
"German", "English", "Swahili"
};
Then I would check that sizeof(langnames)==sizeof(*langnames)*ENOUGH in a debug build.
And pray that I have no duplicates or swapped languages ;-)
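If a C++11 compiler is available (an assumption; the thread predates it), the size check can be moved to compile time with static_assert. A minimal sketch, where the enum name Language and the function language_name are added only for illustration:

```cpp
#include <cstddef>

enum Language { GERMAN = 0, ENGLISH, SWAHILI, ENOUGH };

static const char* const langnames[] = {
    "German", "English", "Swahili"
};

// Compile-time check (C++11): exactly one name per enumerator before ENOUGH.
static_assert(sizeof(langnames) / sizeof(langnames[0]) ==
                  static_cast<std::size_t>(ENOUGH),
              "langnames is out of sync with the Language enum");

// Direct lookup: the enum value is the array index, so no loop is needed.
const char* language_name(Language lang) {
    const int index = static_cast<int>(lang);
    // Guard against values outside the table, e.g. from a bad cast.
    if (index < 0 || index >= ENOUGH) return NULL;
    return langnames[index];
}
```

This catches a missing or extra name at build time, but it still cannot detect duplicated or swapped entries.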
If you want a fast and simple solution, you can try something like this:
enum ELanguage {GERMAN=0, ENGLISH=1};
static const std::string Ger="GERMAN";
static const std::string Eng="ENGLISH";
bool getLanguage(const ELanguage& aIndex, std::string& arName)
{
  switch(aIndex)
  {
    case GERMAN:
    {
      arName=Ger;
      return true;
    }
    case ENGLISH:
    {
      arName=Eng;
      return true;
    }
    default:
    {
      // Log Error
      return false;
    }
  }
}