Apologies in advance if this is the wrong site; please let me know if it is!
I've written a function that checks whether a key exists in a particular std::map, and I wondered whether this is good practice and whether anyone can offer any pointers on improvements.
The std::map allows for multiple data-types to be accepted for the value.
union Variants {
int asInt;
char* asStr;
Variants(int in) { asInt = in; }
Variants() { asInt = 0;}
Variants(char* in) { asStr = in; }
operator int() { return asInt; }
operator char*() { return asStr; }
};
template<typename T, typename Y>
bool in_map(T value, std::map<T, Y> &map)
{
if(map.find(value) == map.end()) {
return false;
}else{
return true;
}
}
And I can then use in main the following:
std::map<std::string, Variants> attributes;
attributes["value1"] = 101;
attributes["value2"] = "Hello, world";
if(in_map<std::string, Variants>("value1", attributes))
{
std::cout << "Yes, exists!";
}
Any help or advice would be greatly appreciated. Sorry if this doesn't comply with the rules or standards. Thanks!
The biggest problem I see with your function is that you're throwing away the resulting iterator.
When you're checking if a key exists in a map, most of the time you want to retrieve/use the associated value after that. Using your function in that case forces you to do a double lookup, at the cost of performance. I would just avoid the use of the function altogether, and write the tests directly, keeping the iterator around for later use in order to avoid useless lookups:
auto it = map_object.find("key");
if (it != map_object.end())
use(it->second);
else
std::cout << "not found" << std::endl;
Of course if you're just checking whether a key exists and don't care for the associated value then your function is fine (taking into account what others told you in the comments) but I think its use cases are quite limited and not really worth the extra function. You could just do:
if (map_object.find("key") != map_object.end())
std::cout << "found, but I don't care about the value" << std::endl;
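If the single-lookup pattern above turns up in many places, it could be wrapped once instead of being repeated. This is only a sketch: find_ptr is a hypothetical helper name (not part of the standard library), and it assumes a null pointer is an acceptable way to signal "not found".

```cpp
#include <map>
#include <string>

// Hypothetical helper: returns a pointer to the mapped value, or a null
// pointer when the key is absent. Only one lookup is performed.
template <typename Map>
const typename Map::mapped_type* find_ptr(const Map& m,
                                          const typename Map::key_type& key)
{
    typename Map::const_iterator it = m.find(key);
    return it == m.end() ? nullptr : &it->second;
}
```

The caller then tests the returned pointer and, if it is not null, dereferences it, which covers both the "does it exist" and the "give me the value" cases.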
any pointers on improvements.
Sure.
template<typename T, typename Y>
bool in_map(T value, const std::map<T, Y> &map)
{
return map.find(value) != map.end();
}
And I'd place map as 1st parameter (just a preference). Also, because the whole thing fits into single line, you might not even need this function.
You're also throwing away returned iterator, but since you aren't using it, that's not a problem.
Apart from this, does this look ok in terms of coding practise? I.e. Using Union or are there other types I can use such as struct?
Well, using char* doesn't look like a good idea, because char* implies that you can modify the data. char* also implies that the pointer is dynamically allocated and that you might want to delete[] it later. And you can't use destructors in unions. If the text cannot be changed, you could use const char*; otherwise you might want to use a different datatype. Also see the Rule of Three.
Next problem - you're trying to place char* and int at the same location. That implies that at some point you're trying to convert pointer to integer. Which is a bad idea, because on 64bit platform pointer might not fit into int, and you'll get only half of it.
Also, if you're trying to store multiple different values in the same variable, you are not indicating which type is being stored anywhere. To do that you would need to enclose union into struct and add field (into struct) that indicates type of stored object. In this case, however, you'll end up reinventing the wheel. So if you're trying to store "universal" type, you might want to look at Boost.Any, Boost.Variant or QVariant. All of those require BIG external libraries, though (either boost or Qt).
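To make the "union inside a struct plus a type field" idea concrete, here is a minimal sketch. TaggedVariant, Kind and describe are hypothetical names made up for illustration, and the string is assumed to be a non-owned const char* that outlives the object. For real code one of the libraries mentioned above is probably the better choice.

```cpp
#include <string>

// Hypothetical tagged wrapper around the union from the question.
struct TaggedVariant
{
    enum Kind { Int, Str };

    TaggedVariant() : kind(Int) { data.asInt = 0; }
    TaggedVariant(int in) : kind(Int) { data.asInt = in; }
    TaggedVariant(const char* in) : kind(Str) { data.asStr = in; }

    Kind kind;  // records which union member is currently valid
    union {
        int asInt;
        const char* asStr;  // not owned: must outlive this object
    } data;
};

// Reads the value back by checking the tag first, never by guessing.
inline std::string describe(const TaggedVariant& v)
{
    switch (v.kind) {
    case TaggedVariant::Int: return "int: " + std::to_string(v.data.asInt);
    case TaggedVariant::Str: return std::string("str: ") + v.data.asStr;
    }
    return std::string();
}
```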
Typing
if(in_map<std::string, Variants>("value1", attributes))
seems a bit excessive to me; typing all of that template syntax makes me want to just use the map.find function instead, out of convenience. However, the compiler can usually deduce the template parameters from the arguments, so, for example, Visual Studio will allow this:
if(in_map(std::string("value1"), attributes))
In this case, I had to construct an std::string object to replace the char*, but I've completely removed the template definition from the call, the compiler still figures out what T and Y are based on the parameters given.
However, my recommended advice would be to use #define to define your "function". While it is not really a function, since #define actually just replaces snippets of code directly into the source, it can make things much easier and visually appealing:
#define in_map(value,map) (map.find(value) != map.end())
Then your code to use it would just look like this:
if(in_map("value1", attributes))
You both get the optimization of not using a function call, and the visual appearance like it does in PHP.
Related
This question is a bump of a question that had a comment here but was deleted as part of the bump.
For those of you who can't see deleted posts, the comment was on my use of const char*s instead of string::const_iterators in this answer: "Iterators may have been a better path from the get go, since it appears that is exactly how your pointers seems be treated."
So my question is this: do string::const_iterators hold any intrinsic value over const char*s, such that switching my answer over to string::const_iterators makes sense?
Introduction
There are many perks of using iterators instead of pointers, among them are:
different code-path in release vs debug, and;
better type-safety, and;
making it possible to write generic code (iterators can be made to work with any data-structure, such as a linked-list, whereas intrinsic pointers are very limited in this regard).
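As a rough illustration of the last point, the following sketch (sum_range is a made-up name) accepts any pair of input iterators, so the same function body serves a std::vector, a std::list and a plain array accessed through pointers.

```cpp
#include <list>
#include <vector>

// Adds up every element in [first, last); works for any input iterator,
// including raw pointers into an array.
template <typename InputIt>
long sum_range(InputIt first, InputIt last)
{
    long total = 0;
    for (; first != last; ++first)
        total += *first;
    return total;
}
```

A version written against int* only would handle the array and the vector's storage, but not the linked list.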
Debugging
Since, among other things, dereferencing an iterator that is past the end of a range is undefined behavior, an implementation is free to do whatever it feels necessary in such a case, including raising diagnostics saying that you are doing something wrong.
The standard library implementation provided by gcc, libstdc++, will issue diagnostics when it detects a fault (if Debug Mode is enabled).
Example
#define _GLIBCXX_DEBUG 1 /* enable debug mode */
#include <vector>
#include <iostream>
int
main (int argc, char *argv[])
{
std::vector<int> v1 {1,2,3};
for (auto it = v1.begin (); ; ++it)
std::cout << *it;
}
/usr/include/c++/4.9.2/debug/safe_iterator.h:261:error: attempt to
dereference a past-the-end iterator.
Objects involved in the operation:
iterator "this" @ 0x0x7fff828696e0 {
type = N11__gnu_debug14_Safe_iteratorIN9__gnu_cxx17__normal_iteratorIPiNSt9__cxx19986vectorIiSaIiEEEEENSt7__debug6vectorIiS6_EEEE (mutable iterator);
state = past-the-end;
references sequence with type `NSt7__debug6vectorIiSaIiEEE' @ 0x0x7fff82869710
}
123
The above would not happen if we were working with pointers, no matter if we are in debug-mode or not.
If we don't enable debug mode for libstdc++, a more performance-friendly implementation (without the added bookkeeping) will be used, and no diagnostics will be issued.
(Potentially) better Type Safety
Since the actual type of an iterator is implementation-defined, this could be used to increase type-safety, but you will have to check the documentation of your implementation to see whether this is the case.
Consider the below example:
#include <vector>
struct A { };
struct B : A { };
// .-- oops
// v
void it_func (std::vector<B>::iterator beg, std::vector<A>::iterator end);
void ptr_func (B * beg, A * end);
// ^-- oops
int
main (int argc, char *argv[])
{
std::vector<B> v1;
it_func (v1.begin (), v1.end ()); // (A)
ptr_func (v1.data (), v1.data () + v1.size ()); // (B)
}
Elaboration
(A) could, depending on the implementation, be a compile-time error, since std::vector<A>::iterator and std::vector<B>::iterator potentially aren't the same type.
(B) would, however, always compile since there's an implicit conversion from B* to A*.
Iterators are intended to provide an abstraction over pointers.
For example, incrementing an iterator always manipulates the iterator so that if there's a next item in the collection, it refers to that next item. If it already referred to the last item in the collection, after the increment it'll be a unique value that can't be dereferenced, but will compare equal to another iterator pointing one past the end of the same collection (usually obtained with collection.end()).
In the specific case of an iterator into a string (or a vector), a pointer provides all the capabilities required of an iterator, so a pointer can be used as an iterator with no loss of required functionality.
For example, you could use std::sort to sort the items in a string or a vector. Since pointers provide the required capabilities, you can also use it to sort items in a native (C-style) array.
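A small sketch of that point, with hypothetical helper names: the same std::sort call is fed string iterators in one function and raw pointers in the other.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Sorts the characters of a std::string through its iterators.
inline std::string sorted_copy(std::string s)
{
    std::sort(s.begin(), s.end());
    return s;
}

// Sorts a native array in place; the raw pointers act as the iterators.
inline void sort_array(int* data, std::size_t count)
{
    std::sort(data, data + count);
}
```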
At the same time, yes, defining (or using) an iterator that's separate from a pointer can provide extra capabilities that aren't strictly required. Just for example, some iterators provide at least some degree of checking, to assure that (for example) when you compare two iterators, they're both iterators into the same collection, and that you aren't attempting an out of bounds access. A raw pointer can't (or at least normally won't) provide this kind of capability.
Much of this comes back to the "don't pay for what you don't use" mentality. If you really only need and want the capabilities of native pointers, they can be used as iterators, and you'll normally get code that's essentially identical to what you'd get by directly manipulating pointers. At the same time, for cases where you do want extra capabilities, such as traversing a threaded RB-tree or a B+ tree instead of a simple array, iterators allow you to do that while maintaining a single, simple interface. Likewise, for cases where you don't mind paying extra (in terms of storage and/or run-time) for extra safety, you can get that too (and it's decoupled from things like the individual algorithm, so you can get it where you want it without being forced to use it in other places that may, for example, have timing requirements too critical to support it).
In my opinion, many people kind of miss the point when it comes to iterators. Many people happily rewrite something like:
for (size_t i=0; i<s.size(); i++)
...into something like:
for (std::string::iterator i = s.begin(); i != s.end(); i++)
...and act as if it's a major accomplishment. I don't think it is. For a case like this, there's probably little (if any) gain from replacing an integer type with an iterator. Likewise, taking the code you posted and changing char const * to std::string::iterator seems unlikely to accomplish much (if anything). In fact, such conversions often make the code more verbose and less understandable, while gaining nothing in return.
If you were going to change the code, you should (in my opinion) do so in an attempt at making it more versatile by making it truly generic (which std::string::iterator really isn't going to do).
For example, consider your split (copied from the post you linked):
vector<string> split(const char* start, const char* finish){
const char delimiters[] = ",(";
const char* it;
vector<string> result;
do{
for (it = find_first_of(start, finish, begin(delimiters), end(delimiters));
it != finish && *it == '(';
it = find_first_of(extractParenthesis(it, finish) + 1, finish, begin(delimiters), end(delimiters)));
auto&& temp = interpolate(start, it);
result.insert(result.end(), temp.begin(), temp.end());
start = ++it;
} while (it <= finish);
return result;
}
As it stands, this is restricted to being used on narrow strings. If somebody wants to work with wide strings, UTF-32 strings, etc., it's relatively difficult to get it to do that. Likewise, if somebody wanted to match [ or '{' instead of (, the code would need to be rewritten for that as well.
If there were a chance of wanting to support various string types, we might want to make the code more generic, something like this:
template <class InIt, class OutIt, class charT>
void split(InIt start, InIt finish, charT paren, charT comma, OutIt result) {
typedef typename std::iterator_traits<OutIt>::value_type o_t;
charT delimiters[] = { comma, paren };
InIt it;
do{
for (it = find_first_of(start, finish, begin(delimiters), end(delimiters));
it != finish && *it == paren;
it = find_first_of(extractParenthesis(it, finish) + 1, finish, begin(delimiters), end(delimiters)));
auto&& temp = interpolate(start, it);
*result++ = o_t{temp.begin(), temp.end()};
start = ++it;
} while (it != finish);
}
This hasn't been tested (or even compiled) so it's really just a sketch of a general direction you could take the code, not actual, finished code. Nonetheless, I think the general idea should at least be apparent--we don't just change it to "use iterators". We change it to be generic, and iterators (passed as template parameters, with types not directly specified here) are only a part of that. To get very far, we also eliminated hard-coding the paren and comma characters. Although not strictly necessary, I also change the parameters to fit more closely with the convention used by standard algorithms, so (for example) output is also written via an iterator rather than being returned as a collection.
Although it may not be immediately apparent, the latter does add quite a bit of flexibility. Just for example, if somebody just wanted to print out the strings after splitting them, he could pass an std::ostream_iterator, to have each result written directly to std::cout as it's produced, rather than getting a vector of strings, and then having to separately print them out.
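To show the output-iterator flexibility in isolation, here is a much simpler stand-in for the function above. split_simple and split_to_text are hypothetical names, and this version only splits on a single delimiter (no parenthesis handling), so it is a sketch of the calling convention rather than a replacement for the real code.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>  // for the std::back_inserter use mentioned below

// Splits [first, last) on delim and writes each piece through out.
template <class FwdIt, class OutIt, class CharT>
OutIt split_simple(FwdIt first, FwdIt last, CharT delim, OutIt out)
{
    for (;;) {
        FwdIt next = std::find(first, last, delim);
        *out++ = std::basic_string<CharT>(first, next);
        if (next == last)
            break;
        first = ++next;  // skip the delimiter itself
    }
    return out;
}

// Example use: each piece goes straight to a stream, one per line,
// without building a container first.
inline std::string split_to_text(const std::string& s)
{
    std::ostringstream out;
    split_simple(s.begin(), s.end(), ',',
                 std::ostream_iterator<std::string>(out, "\n"));
    return out.str();
}
```

Passing std::back_inserter(some_vector) instead of the std::ostream_iterator collects the pieces into a std::vector<std::string>, with no change to split_simple.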
I once landed an interview and was asked what the purpose of assigning a variable by reference would be (as in the following case):
int i = 0;
int &j = i;
My answer was that C++ references work like C pointers, but cannot assume the NULL value, they must always point to a concrete object in memory. Of course, the syntax is different when using references (no need for the pointer indirection operator, and object properties will be accessed via the dot (.) rather than the arrow (->) operator). Perhaps the most important difference, is that unlike with pointers, where you can make a pointer point to something different (even after it was pointing to the same thing as another pointer), with references, if one reference is updated, then the other references which pointed to the same thing are also updated to point to the very same object.
But then I went on to say that the above use of references is pretty useless (and perhaps this is where I went wrong), because I couldn't see a practical advantage to assigning by reference: since both references end up pointing to the same thing, you could easily make do with one reference, and I couldn't think of a case where that wouldn't hold. I went on to explain that references are useful as pass-by-reference function parameters, but not in assignments. But the interviewer said they assign by reference in their code all the time, and flunked me (I then went on to work for a company that this company was a client of, but that's beside the point).
Anyways, several years later, I would like to know where I could have gone wrong.
To begin with, I'd hope for that company's sake that wasn't the ONLY reason they didn't hire you, since it's a petty detail (and no, you don't really know exactly why a company doesn't hire you).
As touched on in the comment, references NEVER change what they refer to within their lifetime. Once set, a reference refers to that same location, until it "dies".
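A tiny sketch of what "never change what they refer to" means in practice (the function name is made up): assigning through a reference writes to the object it is bound to, and does not rebind the reference.

```cpp
// Returns true when assigning through the reference changed the value of
// the object it is bound to and left the reference bound to that object.
inline bool reference_stays_bound()
{
    int a = 1;
    int b = 2;
    int& r = a;  // r is bound to a for its whole lifetime
    r = b;       // copies b's value into a; does NOT rebind r to b
    return a == 2 && b == 2 && &r == &a;
}
```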
Now, references are quite useful to simplify an expression. Say we have a class or structure with a fair amount of complicated content. Say something like this:
struct A
{
int x, y, z;
};
struct B
{
A arr[100];
};
class C
{
public:
void func();
B* list[20];
};
void C::func()
{
...
if (list[i]->arr[j].x == 4 && list[i]->arr[j].y == 5 &&
(list[i]->arr[j].z < 10 || list[i]->arr[j].z > 90))
{
... do stuff ...
}
}
That's a lot of repeats of list[i]->arr[j] in there. So we could rewrite it using a reference:
void C::func()
{
...
A &cur = list[i]->arr[j];
if (cur.x == 4 && cur.y == 5 &&
(cur.z < 10 || cur.z > 90))
{
... do stuff ...
}
}
The above code assumes do stuff is actually modifying the cur element in some way; if not, you should probably use const A &cur =... instead.
I use this technique quite a bit to make it clearer and less repetitive.
In this particular case of assigning a reference to a local variable of primitive type in the same scope, the assignment is very much useless: there is nothing you can do using j that you could not do using i. There are several mildly negative consequences to it, too, because the readability would suffer, and the optimizer may get confused.
Here is one legitimate use of assigning a reference:
class demo {
private:
map<int,string> cache;
string read_resource(int id) {
string resource_string;
... // Lengthy process for getting a non-empty resource string
return resource_string;
}
public:
string& get_by_id(int id) {
// Here is a nice trick
string &res = cache[id];
if (res.size() == 0) {
// Assigning res modifies the string in the map
res = read_resource(id);
}
return res;
}
};
Above, variable res of reference type refers to an element of the map that is either retrieved, or created new. If the string is created new, the code calls the "real" getter, and assigns its result to res. This automatically updates the cache, too, saving us another lookup in the cache map.
if (find(visitable.begin(), visitable.end(), ourstack.returnTop())) { ... }
I want to determine whether the top character in stack ourstack can be found in the vector visitable. If yes, I want this character to be deleted from visitable.
How would I code that? I know vectors use erase, but that requires the specific location of that character (which I don't know).
This is for my maze-path-finding assignment.
Also, my returnTop is giving me an error: class "std.stack<char..." has no member returnTop. I declared #include <stack> at the top of my program. What's happening here?
Thanks in advance!
If you are using find, then you already know the location of the character. find returns an iterator to the position where the character is found, or to the value used as end if it cannot find it.
vector<?>::const_iterator iter =
find(visitable.begin(), visitable.end(), ourstack.top());
if( iter != visitable.end() )
{
visitable.erase( iter );
}
As for stack, the function you are looking for is top(). The standard C++ library does not use camelCased identifiers, that looks more like a Java or C# thing.
Just like this:
// Note assume C++0x notation for simplicity since I don't know the type of the template
auto character = ourstack.top();
auto iter = std::find(visitable.begin(), visitable.end(), character);
if (iter != visitable.end())
visitable.erase(iter);
returnTop does not exist in the stack class, but top does.
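Put together as a self-contained function, the find-then-erase approach might look like the sketch below. It assumes the element type is char (the question talks about characters) and that the stack is not empty; erase_top_from is a hypothetical name.

```cpp
#include <algorithm>
#include <stack>
#include <vector>

// Removes the first occurrence of the stack's top element from visitable.
// Returns true if something was erased. Precondition: ourstack is not empty.
inline bool erase_top_from(std::vector<char>& visitable,
                           const std::stack<char>& ourstack)
{
    std::vector<char>::iterator it =
        std::find(visitable.begin(), visitable.end(), ourstack.top());
    if (it == visitable.end())
        return false;     // not found: nothing to erase
    visitable.erase(it);  // the iterator from find is the location erase needs
    return true;
}
```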
Alternatively if you want some generic (and rather flamboyant way) of doing it:
// Assume type of vector and stack are the same
template <class T>
void TryRemoveCharacter(std::vector<T>& visitable, const std::stack<T>& ourStack)
{
// Note, could have passed a ref to the character directly, which IMHO makes more sense
const T& ourChar = ourStack.top();
visitable.erase(std::remove_if(visitable.begin(), visitable.end(), [&ourChar](const T& character)
{
// Note: for a class type T this requires a user-provided operator==,
// because compilers do not generate equality comparison automatically.
// See http://stackoverflow.com/questions/217911/why-dont-c-compilers-define-operator-and-operator
// Either overload operator== to do a true comparison, or
// add a comparison method and invoke it here.
return ourChar == character;
}), visitable.end()); // pass end() as well so every matching element is erased
}
Note: this alternative way may not be a good idea for an assignment, as your teacher will probably find it suspicious that you introduce advanced C++ features (C++0x) all of a sudden.
However for intellectual curiosity it could work ;)
Here's how you may use it:
TryRemoveCharacter(visitable, ourstack);
I'm trying to get my head around tuples (thanks @litb), and the common suggestion for their use is for functions returning > 1 value.
This is something that I'd normally use a struct for, and I can't understand the advantages of tuples in this case - it seems an error-prone approach for the terminally lazy.
Borrowing an example, I'd use this
struct divide_result {
int quotient;
int remainder;
};
Using a tuple, you'd have
typedef boost::tuple<int, int> divide_result;
But without reading the code of the function you're calling (or the comments, if you're dumb enough to trust them) you have no idea which int is the quotient and which is the remainder. It seems rather like...
struct divide_result {
int results[2]; // 0 is quotient, 1 is remainder, I think
};
...which wouldn't fill me with confidence.
So, what are the advantages of tuples over structs that compensate for the ambiguity?
tuples
I think I agree with you that the issue of what position corresponds to what variable can introduce confusion. But I think there are two sides. One is the call side and the other is the callee side:
int remainder;
int quotient;
tie(quotient, remainder) = div(10, 3);
I think it's crystal clear what we got, but it can become confusing if you have to return more values at once. Once the programmer writing the caller has looked up the documentation of div, he will know what position is what, and can write effective code. As a rule of thumb, I would say not to return more than 4 values at once. For anything beyond, prefer a struct.
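For reference, a compilable version of the tie pattern might look like this. It is a sketch that assumes C++11's std::tuple and std::tie rather than the Boost or TR1 versions, and divide is a hypothetical stand-in for the div used above.

```cpp
#include <tuple>

// Hypothetical function: element 0 is the quotient, element 1 the remainder.
inline std::tuple<int, int> divide(int dividend, int divisor)
{
    return std::make_tuple(dividend / divisor, dividend % divisor);
}
```

At the call site, std::tie(quotient, remainder) = divide(10, 3); unpacks both values into named local variables in one statement.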
output parameters
Output parameters can be used too, of course:
int remainder;
int quotient;
div(10, 3, &quotient, &remainder);
Now I think that illustrates how tuples are better than output parameters. We have mixed the input of div with its output, while not gaining any advantage. Worse, we leave the reader of that code in doubt about what the actual return value of div might be. There are wonderful examples where output parameters are useful. In my opinion, you should use them only when you've got no other way, because the return value is already taken and can't be changed to either a tuple or struct. operator>> is a good example of where you use output parameters, because the return value is already reserved for the stream, so you can chain operator>> calls. If you are not dealing with operators, and the context is not crystal clear, I recommend you use pointers, to signal at the call side that the object is actually used as an output parameter, in addition to comments where appropriate.
returning a struct
The third option is to use a struct:
div_result d = div(10, 3);
I think that definitely wins the award for clarity. But note you still have to access the result within that struct, and the result is not "laid bare" on the table, as was the case for the output parameters and the tuple used with tie.
I think a major point these days is to make everything as generic as possible. So, say you have got a function that can print out tuples. You can just do
cout << div(10, 3);
And have your result displayed. I think that tuples, on the other side, clearly win for their versatile nature. Doing that with div_result, you need to overload operator<<, or need to output each member separately.
Another option is to use a Boost Fusion map (code untested):
struct quotient;
struct remainder;
using boost::fusion::map;
using boost::fusion::pair;
typedef map<
pair< quotient, int >,
pair< remainder, int >
> div_result;
You can access the results relatively intuitively:
using boost::fusion::at_key;
res = div(x, y);
int q = at_key<quotient>(res);
int r = at_key<remainder>(res);
There are other advantages too, such as the ability to iterate over the fields of the map, etc etc. See the doco for more information.
With tuples, you can use tie, which is sometimes quite useful: std::tr1::tie (quotient, remainder) = do_division ();. This is not so easy with structs. Second, when using template code, it's sometimes easier to rely on pairs than to add yet another typedef for the struct type.
And if the types are different, then a pair/tuple is really no worse than a struct. Think for example pair<int, bool> readFromFile(), where the int is the number of bytes read and bool is whether the eof has been hit. Adding a struct in this case seems like overkill for me, especially as there is no ambiguity here.
Tuples are very useful in languages such as ML or Haskell.
In C++, their syntax makes them less elegant, but can be useful in the following situations:
you have a function that must return more than one argument, but the result is "local" to the caller and the callee; you don't want to define a structure just for this
you can use the tie function to do a very limited form of pattern matching "a la ML", which is more elegant than using a structure for the same purpose.
they come with predefined < operators, which can be a time saver.
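The last point can be sketched as follows, assuming C++11's std::tuple (the Boost version offers comparison operators too, via a separate header). Entry and sort_entries are made-up names.

```cpp
#include <algorithm>
#include <string>
#include <tuple>
#include <vector>

typedef std::tuple<int, std::string> Entry;  // hypothetical (priority, name) record

// Sorts by priority first, then by name, using the tuple's built-in
// lexicographic operator<; no hand-written comparison is needed.
inline void sort_entries(std::vector<Entry>& entries)
{
    std::sort(entries.begin(), entries.end());
}
```

With a struct, the equivalent ordering would need a hand-written operator< or comparison functor.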
I tend to use tuples in conjunction with typedefs to at least partially alleviate the 'nameless tuple' problem. For instance if I had a grid structure then:
//row is element 0 column is element 1
typedef boost::tuple<int,int> grid_index;
Then I use the named type as :
grid_index find(const grid& g, int value);
This is a somewhat contrived example but I think most of the time it hits a happy medium between readability, explicitness, and ease of use.
Or in your example:
//quotient is element 0 remainder is element 1
typedef boost::tuple<int,int> div_result;
div_result div(int dividend,int divisor);
One feature of tuples that you don't have with structs is in their initialization. Consider something like the following:
struct A
{
int a;
int b;
};
Unless you write a make_tuple equivalent or constructor then to use this structure as an input parameter you first have to create a temporary object:
void foo (A const & a)
{
// ...
}
void bar ()
{
A dummy = { 1, 2 };
foo (dummy);
}
Not too bad, however, take the case where maintenance adds a new member to our struct for whatever reason:
struct A
{
int a;
int b;
int c;
};
The rules of aggregate initialization actually mean that our code will continue to compile without change. We therefore have to search for all usages of this struct and update them, without any help from the compiler.
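A minimal sketch of that silent behaviour, using a hypothetical Triple in place of the three-member A: the old two-value initializer still compiles, and the new member simply ends up as zero.

```cpp
struct Triple  // stands in for the three-member version of A
{
    int a;
    int b;
    int c;
};

// Still compiles after c was added: a missing initializer is not an error.
inline Triple make_old_style()
{
    Triple t = { 1, 2 };  // c is silently initialized to 0
    return t;
}
```

Some compilers can warn about the missing initializer if asked to, but the language itself accepts the code.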
Contrast this with a tuple:
typedef boost::tuple<int, int, int> Tuple;
enum {
A
, B
, C
};
void foo (Tuple const & p) {
}
void bar ()
{
foo (boost::make_tuple (1, 2)); // Compile error
}
The compiler cannot initialize "Tuple" with the result of make_tuple, and so generates the error that allows you to specify the correct values for the third parameter.
Finally, the other advantage of tuples is that they allow you to write code which iterates over each value. This is simply not possible using a struct.
void incrementValues (boost::tuples::null_type) {}
template <typename Tuple_>
void incrementValues (Tuple_ & tuple) {
// ...
++tuple.get_head ();
incrementValues (tuple.get_tail ());
}
Tuples prevent your code from being littered with many struct definitions. It's easier for the person writing the code, and for others using it, when you just document what each element in the tuple is, rather than writing your own struct and making people look up the struct definition.
Tuples will be easier to write - no need to create a new struct for every function that returns something. Documentation about what goes where will go to the function documentation, which will be needed anyway. To use the function one will need to read the function documentation in any case and the tuple will be explained there.
I agree with you 100% Roddy.
To return multiple values from a method, you have several options other than tuples, which one is best depends on your case:
Creating a new struct. This is good when the multiple values you're returning are related, and it's appropriate to create a new abstraction. For example, I think "divide_result" is a good general abstraction, and passing this entity around makes your code much clearer than just passing a nameless tuple around. You could then create methods that operate on this new type, convert it to other numeric types, etc.
Using "Out" parameters. Pass several parameters by reference, and return multiple values by assigning to each out parameter. This is appropriate when your method returns several unrelated pieces of information. Creating a new struct in this case would be overkill, and with Out parameters you emphasize this point, plus each item gets the name it deserves.
Tuples are Evil.
I have an array of constant data like following:
enum Language {GERMAN=LANG_DE, ENGLISH=LANG_EN, ...};
struct LanguageName {
Language language;
const char *name;
};
const LanguageName languages[] = {
{GERMAN, "German"},
{ENGLISH, "English"},
.
.
.
};
I have a function which accesses the array and finds the entry based on the Language enum parameter. Should I write a loop to find the specific entry in the array, or are there better ways to do this?
I know I could add the LanguageName-objects to an std::map but wouldn't this be overkill for such a simple problem? I do not have an object to store the std::map so the map would be constructed for every call of the function.
What way would you recommend?
Is it better to encapsulate this compile time constant array in a class which handles the lookup?
If the enum values are contiguous starting from 0, use an array with the enum as index.
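A sketch of that first option follows. Note the assumption, stated in the code as well: the enumerators here are hypothetical and deliberately contiguous from 0, unlike the LANG_DE-style values in the question, and LANGUAGE_COUNT is an added sentinel.

```cpp
#include <cstddef>

// Assumed for this sketch: enumerators are contiguous and start at 0.
enum Language { GERMAN, ENGLISH, FRENCH, LANGUAGE_COUNT };

static const char* const language_names[LANGUAGE_COUNT] = {
    "German", "English", "French"
};

// Constant-time lookup; returns a null pointer for out-of-range values.
inline const char* find_language(Language lang)
{
    const std::size_t index = static_cast<std::size_t>(lang);
    return index < static_cast<std::size_t>(LANGUAGE_COUNT)
               ? language_names[index]
               : nullptr;
}
```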
If not, this is what I usually do:
const char* find_language(Language lang)
{
typedef std::map<Language,const char*> lang_map_type;
typedef lang_map_type::value_type lang_map_entry_type;
static const lang_map_entry_type lang_map_entries[] = { /*...*/ };
static const lang_map_type lang_map( lang_map_entries
, lang_map_entries + sizeof(lang_map_entries)
/ sizeof(lang_map_entries[0]) );
lang_map_type::const_iterator it = lang_map.find(lang);
if( it == lang_map.end() ) return NULL;
return it->second;
}
If you consider a map for constants, always also consider using a vector.
Function-local statics are a nice way to get rid of a good part of the dependency problems of globals, but are dangerous in a multi-threaded environment. If you're worried about that, you might rather want to use globals:
typedef std::map<Language,const char*> lang_map_type;
typedef lang_map_type::value_type lang_map_entry_type;
const lang_map_entry_type lang_map_entries[] = { /*...*/ };
const lang_map_type lang_map( lang_map_entries
, lang_map_entries + sizeof(lang_map_entries)
/ sizeof(lang_map_entries[0]) );
const char* find_language(Language lang)
{
lang_map_type::const_iterator it = lang_map.find(lang);
if( it == lang_map.end() ) return NULL;
return it->second;
}
There are three basic approaches that I'd choose from. One is the switch statement, and it is a very good option under certain conditions. Remember - the compiler is probably going to compile that into an efficient table-lookup for you, though it will be looking up pointers to the case code blocks rather than data values.
Options two and three involve static arrays of the type you are using. Option two is a simple linear search - which you are (I think) already doing - very appropriate if the number of items is small.
Option three is a binary search. Static arrays can be used with standard library algorithms - just use the first and first+count pointers in the same way that you'd use begin and end iterators. You will need to ensure the data is sorted (using std::sort or std::stable_sort), and use std::lower_bound to do the binary search.
The complication in this case is that you'll need a comparison function object which acts like operator< but only looks at the key field of your struct. std::lower_bound calls it with an element of the array first and the value being searched for second. The following is a rough template...
class cMyComparison
{
public:
// Called by std::lower_bound as comp(element, value)
bool operator() (const structtype& p_Struct, const fieldtype& p_Value) const
{
// True if the element's key orders before the searched-for key
return (p_Struct.field < p_Value);
}
};
This kind of thing should get simpler in the next C++ standard revision, when IIRC we'll get anonymous functions (lambdas) and closures.
If you can't put the sort in your app's initialisation, you might need an already-sorted boolean static variable to ensure you only sort once.
Note - this is for information only - in your case, I think you should either stick with linear search or use a switch statement. The binary search is probably only a good idea when...
There are a lot of data items to search
Searches are done very frequently (many times per second)
The key enumerate values are sparse (lots of big gaps) - otherwise, switch is better.
If the coding effort were trivial, it wouldn't be a big deal, but C++ currently makes this a bit harder than it should be.
One minor note - it may be a good idea to define an enumerate for the size of your array, and to ensure that your static array declaration uses that enumerate. That way, your compiler should complain if you modify the table (add/remove items) and forget to update the size enum, so your searches should never miss items or go out of bounds.
I think you have two questions here:
What is the best way to store a constant global variable (with possible Multi-Threaded access) ?
How to store your data (which container to use)?
The solution described by sbi is elegant, but you should be aware of 2 potential problems:
In case of multi-threaded access, the initialization could be screwed up.
You will potentially attempt to access this variable after its destruction.
Both issues on the lifetime of static objects are being covered in another thread.
Let's begin with the constant global variable storage issue.
The solution proposed by sbi is therefore adequate if you are not concerned about 1. or 2.; in any other case I would recommend the use of a Singleton, such as the ones provided by Loki. Read the associated documentation to understand the various lifetime policies; it is very valuable.
I think that the use of an array + a map seems wasteful and it hurts my eyes to read this. I personally prefer a slightly more elegant (imho) solution.
const char* find_language(Language lang)
{
typedef std::map<Language, const char*> map_type;
typedef map_type::value_type value_type;
// I'll let you work out how 'my_stl_builder' works,
// it makes for an interesting exercise and it's easy enough
// Note that even if this is slightly slower (?), it is only executed ONCE!
static const map_type lang_map = my_stl_builder<map_type>()
<< value_type(GERMAN, "German")
<< value_type(ENGLISH, "English")
<< value_type(DUTCH, "Dutch")
....
;
map_type::const_iterator it = lang_map.find(lang);
if( it == lang_map.end() ) return NULL;
return it->second;
}
And now on to the container type issue.
If you are concerned about performance, then you should be aware that for small data collections, a vector of pairs is normally more efficient for lookups than a map. Once again I would turn toward Loki (and its AssocVector), but really I don't think that you should worry about performance.
I tend to choose my container depending on the interface I am likely to need first and here the map interface is really what you want.
Also: why do you use 'const char*' rather than a 'std::string'?
I have seen too many people using a 'const char*' like a std::string (like in forgetting that you have to use strcmp) to be bothered by the alleged loss of memory / performance...
It depends on the purpose of the array. If you plan on showing the values in a list (for a user selection, perhaps) the array would be the most efficient way of storing them. If you plan on frequently looking up values by their enum key, you should look into a more efficient data structure like a map.
There is no need to write a loop. You can use the enum value as index for the array.
I would make an enum with sequential language codes
enum { GERMAN=0, ENGLISH, SWAHILI, ENOUGH };
Then put them all into an array
const char *langnames[] = {
"German", "English", "Swahili"
};
Then I would check that sizeof(langnames)==sizeof(*langnames)*ENOUGH in a debug build.
And pray that I have no duplicates or swapped languages ;-)
If you want a fast and simple solution, you can try something like this:
enum ELanguage {GERMAN=0, ENGLISH=1};
static const string Ger="GERMAN";
static const string Eng="ENGLISH";
bool getLanguage(const ELanguage& aIndex,string & arName)
{
switch(aIndex)
{
case GERMAN:
{
arName=Ger;
return true;
}
case ENGLISH:
{
arName=Eng;
return true; // without this, control falls through to default and returns false
}
default:
{
// Log Error
return false;
}
}
}