C/C++ switch case with string [duplicate] - c++

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
C/C++: switch for non-integers
Hi,
I need to use a string in switch case. My solution so far was to calculate the hash of the string with my hash function. Problem is I have to manually pre-calculate all my hash values for strings. Is there a better approach?
h=_myhash (mystring);
switch (h)
{
case 66452:
.......
case 1342537:
........
}

Just use an if() { } else if () { } chain. Using a hash value is going to be a maintenance nightmare. switch is intended to be a low-level statement and is not appropriate for string comparisons.

You could map the strings to function pointer using a standard collection; executing the function when a match is found.
EDIT: Using the example in the article I gave the link to in my comment, you can declare a function pointer type:
typedef void (*funcPointer)(int);
and create multiple functions to match the signature:
void String1Action(int arg);
void String2Action(int arg);
The map would be std::string to funcPointer:
std::map<std::string, funcPointer> stringFunctionMap;
Then add the strings and function pointers:
stringFunctionMap["string1"] = &String1Action;
I've not tested any of the code I have just posted, it's off the top of my head :)
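For reference, here is a minimal sketch of how the pieces above could fit together. The dispatch function and the lastHandled variable are hypothetical names added only for illustration, and the handler bodies are placeholders.

```cpp
#include <map>
#include <string>

typedef void (*funcPointer)(int);

int lastHandled = 0;  // illustration only: records which handler ran

void String1Action(int arg) { lastHandled = 100 + arg; }
void String2Action(int arg) { lastHandled = 200 + arg; }

// Looks up `key` and calls the matching handler.
// Returns false when no handler is registered for the string.
bool dispatch(const std::string& key, int arg) {
    static const std::map<std::string, funcPointer> stringFunctionMap = {
        {"string1", &String1Action},
        {"string2", &String2Action},
    };
    std::map<std::string, funcPointer>::const_iterator it =
        stringFunctionMap.find(key);
    if (it == stringFunctionMap.end()) {
        return false;  // unknown string: nothing is called
    }
    it->second(arg);
    return true;
}
```

Using find() rather than operator[] means an unknown string never inserts a null pointer into the map.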

Typically, you would use a hash table and function object, both available in Boost, TR1 and C++0x.
void func1() {
}
std::unordered_map<std::string, std::function<void()>> hash_map;
hash_map["Value1"] = &func1;
// .... etc
hash_map[mystring]();
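This is a little more overhead at runtime but a bajillion times more maintainable. Hash tables offer O(1) average-case insertion and lookup, which puts them in the same complexity class as an assembly-style jump table. Note that hash_map[mystring]() throws std::bad_function_call if mystring was never registered, because operator[] inserts an empty std::function for a missing key; use find() if unknown strings are possible.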

The best way is to use source generation, so that you could use
if (hash(str) == HASH("some string")) ..
in your main source, and a pre-build step would convert the HASH(const char*) expression to an integer value.
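With a C++11 compiler, the compiler itself can play the role of the pre-build step. The sketch below is a swapped-in technique, not the source generation described above: it assumes a constexpr FNV-1a hash (fnv1a and classify are hypothetical names) and uses it directly in case labels.

```cpp
#include <cstdint>
#include <string>

// 32-bit FNV-1a, written recursively so it is a valid C++11 constexpr function.
constexpr std::uint32_t fnv1a(const char* s, std::uint32_t h = 2166136261u) {
    return *s == '\0'
               ? h
               : fnv1a(s + 1,
                       (h ^ static_cast<std::uint32_t>(
                                static_cast<unsigned char>(*s))) *
                           16777619u);
}

// Returns 1 for "start", 2 for "stop", 0 for anything else.
int classify(const std::string& str) {
    switch (fnv1a(str.c_str())) {
    case fnv1a("start"):
        // The hash only narrows the choice; compare the string to rule out
        // a runtime collision with some other input.
        return str == "start" ? 1 : 0;
    case fnv1a("stop"):
        return str == "stop" ? 2 : 0;
    default:
        return 0;
    }
}
```

If two of the label strings ever hash to the same value, the duplicate case labels make the compiler reject the switch, so collisions among the known strings are caught at build time.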

You could create a hash table. The keys can be the strings and the values can be integers. Set up the integer values as constants and then you can check for them with the switch.

Ruslik's suggestion to use source generation seems like a good thing to me. However, I wouldn't go with the concept of "main" and "generated" source files. I'd rather have one file with code almost identical to yours:
h=_myhash (mystring);
switch (h)
{
case 66452: // = hash("Vasia")
.......
case 1342537: // = hash("Petya")
........
}
The next thing I'd do is write a simple script. Perl is good for this kind of thing, but nothing stops you from writing a simple program in C/C++ if you don't want to use any other languages. This script, or program, would take the source file, read it line by line, find all those case NUMBERS: // = hash("SOMESTRING") lines (use regular expressions here), replace NUMBERS with the actual hash value and write the modified source into a temporary file. Finally, it would back up the source file and replace it with the temporary file. If you don't want your source file to have a new time stamp each time, the program could check whether something was actually changed and, if not, skip the file replacement.
The last thing to do is to integrate this script into the build system used, so you won't accidentally forget to launch it before building the project.

You could use the string to index into a hash table of function pointers.
Edit: glib has a hash table implementation that supports strings as keys and arbitrary pointers as values: http://library.gnome.org/devel/glib/stable/glib-Hash-Tables.html

You can use an enumeration and a map, so that your string becomes the key and the enum value is the value for that key.
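A minimal sketch of that idea, with hypothetical names (Command, toCommand, describe) and made-up command strings:

```cpp
#include <map>
#include <string>

enum Command { CMD_UNKNOWN, CMD_OPEN, CMD_CLOSE };

// Translates a string into its enum value; unknown strings map to CMD_UNKNOWN.
Command toCommand(const std::string& s) {
    static const std::map<std::string, Command> table = {
        {"open", CMD_OPEN},
        {"close", CMD_CLOSE},
    };
    std::map<std::string, Command>::const_iterator it = table.find(s);
    return it == table.end() ? CMD_UNKNOWN : it->second;
}

// The switch now works on the enum instead of on the string.
std::string describe(const std::string& s) {
    switch (toCommand(s)) {
    case CMD_OPEN:
        return "opening";
    case CMD_CLOSE:
        return "closing";
    default:
        return "unknown command";
    }
}
```

The strings appear in exactly one place (the table), and the switch keeps readable case labels.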

If you are after performance and don't want to go through all the if clauses each time if there are many or the need to hash the values, you could send some extra information to the function with the help of enum or just add an enum type to your structure.

There is no good solution to your problem, so here is an okay solution ;-)
It keeps your efficiency when assertions are disabled and when assertions are enabled it will raise an assertion error when the hash value is wrong.
I suspect that the D programming language could compute the hash value during compile time, thus removing the need to explicitly write down the hash value.
template <std::size_t h>
struct prehash
{
const your_string_type str;
static const std::size_t hash_value = h;
prehash(const your_string_type& s) : str(s)
{
assert(_myhash(s) == hash_value);
}
};
/* ... */
std::size_t h = _myhash(mystring);
static prehash<66452> first_label("label1");
switch (h) {
case first_label.hash_value:
// ...
;
}
By the way, consider removing the initial underscore from the declaration of _myhash(). A C++ implementation is free to define macros with names starting with an underscore and an uppercase letter (Item 36 of "Exceptional C++ Style" by Herb Sutter), so if you get into the habit of giving things names that start with an underscore, then one fine day you could give a symbol a name that starts with an underscore and an uppercase letter where the implementation has defined a macro with the same name.

Related

Efficient way to check if string contains value of vector<string>?

I'm pretty new to C++ programming but for certain reasons I need to develop a small tool in C++. I've written the same tool in C# already. Right now I'm trying to check if my string contains a value that is stored in a std::vector. In C# this is pretty straight forward, simply using something like this:
if(string.Contains(myarray)) { // do sth. }
In C++ this seems way harder to achieve. I googled quite a bit but so far I found only solutions to check if the WHOLE string exists in an array and not just a certain part of it.
Unfortunately std::string does not have a method that checks whether an element of a vector is in a string, as C# does. What it does have, though, is std::string::find, which can determine whether a string is contained within the string you call find on. You could use that like
std::vector<std::string> words;
// fill words
std::string search_me = "some text";
for (const auto & e : words)
{
if (search_me.find(e) != std::string::npos)
{
// e is contained in search me. do something here
// call break here if you only want to check for one existence
}
}
This is O(N*complexity_of_find).
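The same loop can be wrapped in a reusable helper with std::any_of. This is only a sketch; containsAny is a hypothetical name, not a standard function.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns true if at least one element of `words` occurs as a substring of `text`.
// Note: an empty string in `words` always matches, because find("") returns 0.
bool containsAny(const std::string& text,
                 const std::vector<std::string>& words) {
    return std::any_of(words.begin(), words.end(),
                       [&text](const std::string& w) {
                           return text.find(w) != std::string::npos;
                       });
}
```

std::any_of stops at the first match, which corresponds to calling break in the loop above.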
Use a for loop and the find method.
I would suggest std::find_first_of
Not sure if I understood your exact problem, though. Could you give a small example of what you are trying to find in what?
If you need a more efficient way to find several substrings in a string than searching for them one by one, you can use the Aho-Corasick algorithm.
It uses a trie to hold the substrings; C++ implementations are easy to find online.

(C++/ QT) Use a switch statement with strings by turning the strings into an int [duplicate]

This question already has answers here:
How to write a switch statement for strings in Qt?
(3 answers)
Closed 6 years ago.
At this website I found an interesting way to create a switch statement with strings. However, this seems really long and drawn out. I wanted to know if it's possible to turn a particular string into an integer that can be used in a switch statement.
So pseudocode would be something like this
QString str = "spinbox_1";
switch stoi( str )
case 1543:
//code here
case 2343:
//code here
case 3424:
//code here
As @Slava mentioned, it is not easily possible. The solution provided by the author of the linked page is probably the most practical one. But if for some reason you really need to do it another way and convert the string into a number, you can use a hashing method. See CityHash below, which is widely used (you can of course use any other hash function).
https://github.com/google/cityhash
This may be duplicate of:
How can I hash a string to an int using c++?
Try to look at this solution:
https://github.com/Efrit/str_switch/blob/master/str_switch.h
Unfortunately, the description of this solution is available only in Russian (at least I can't find one in English). It is based on computing the hash of the string at compile time. Its only limitation is that it supports strings of at most 9 characters.
If I ever find myself in a similar situation, I use a map to define a specific int from the given string.
For Example:
// The string you want to convert to an int
std::string myString = "stringTwo";
// The mapping that you set for string to int conversions
std::map<std::string, int> stringToInt =
{{"stringOne" , 1},
{"stringTwo" , 2},
{"stringThree", 3}};
// Here, myInt is defined as 2
// (operator[] inserts the key with value 0 if the string is not in the map)
int myInt = stringToInt[myString];
Now you could put myInt into a switch case.
No, it is not possible to map strings to integers uniquely in general: there are simply more strings than integers. You can calculate a hash that is unlikely to collide for two different strings and compare the hashes, but it is still possible for two different strings to have the same hash, so you have to compare the strings themselves after checking their hashes. This is how std::unordered_set and std::unordered_map (also known as hash_set and hash_map) are implemented, so you can use them. But then you would not use a switch() statement.

C++ Assigning variables with 'compound names' using an external argument

I'm trying to read a .pdb file and hence I'm ending with a lot of variables in my code. In an effort to reduce them (and avoid Segmentation fault errors) I was wondering if I could assign array names in my code using an external argument.
The starting bit of my code foo.cpp looks like this-
/*All the relevant headers*/
using namespace std ;
int main(int argc, char *argv[])
{
ifstream input(argv[1],ios::out) ;
string first(argv[2]) ;
string second(argv[3]) ;
string "first"ATOM[1000] ;
string "second"ATOM[1000] ;
}
And I'm hoping that if I launch the program as ./foo.exe input C O, I want two arrays called CATOM and OATOM to be initialised.
If there is no second argument then the OATOM array should not get defined.
This would save me the trouble of having to make multiple arrays such as NATOM[1000], OATOM[1000] etc. since I can define them within the program.
Is this possible? For each 'O', 'C', 'N' etc there need to be about 8-10 long string arrays which is causing it to blow up.
I'm new to programming and I hope this question makes sense.
Thanks in advance!
I suggest creating a struct with array and a string variable containing the name of that array and then you just search the structs by name.
A more elegant solution is to use std::map, as @NathanOliver suggested. Changing variable names at run time is not possible (or logical) in C++, as far as I know.
It is not possible to change or set variable names at run time.
However, map (also known as dictionary or associative array) is a data structure that allows you to associate key objects (such as a string) to value objects (such as an array) and it possibly fits your needs. There is an implementation of map in the standard library, that you can use.
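As a rough sketch of how that could look for the question's case: the map key takes over the role of the variable name. AtomTable and makeAtomTable are hypothetical names, and the element symbols are assumed to come from the command-line arguments.

```cpp
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::vector<std::string> > AtomTable;

// Creates one (initially empty) vector per requested element symbol,
// keyed by names such as "CATOM" or "OATOM".
AtomTable makeAtomTable(const std::vector<std::string>& elements) {
    AtomTable table;
    for (const std::string& e : elements) {
        // operator[] creates the entry; reserve mirrors the 1000-slot arrays
        // from the question without fixing the size.
        table[e + "ATOM"].reserve(1000);
    }
    return table;
}
```

In main, the elements vector could be filled from argv[2] onwards, so an element that is not given on the command line simply has no entry in the table.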

Named parameter string formatting in C++

I'm wondering if there is a library like Boost Format, but which supports named parameters rather than positional ones. This is a common idiom in e.g. Python, where you have a context to format strings with that may or may not use all available arguments, e.g.
mouse_state = {}
mouse_state['button'] = 0
mouse_state['x'] = 50
mouse_state['y'] = 30
#...
"You clicked %(button)s at %(x)d,%(y)d." % mouse_state
"Targeting %(x)d, %(y)d." % mouse_state
Are there any libraries that offer the functionality of those last two lines? I would expect it to offer a API something like:
PrintFMap(string format, map<string, string> args);
In Googling I have found many libraries offering variations of positional parameters, but none that support named ones. Ideally the library has few dependencies so I can drop it easily into my code. C++ won't be quite as idiomatic for collecting named arguments, but probably someone out there has thought more about it than me.
Performance is important, in particular I'd like to keep memory allocations down (always tricky in C++), since this may be run on devices without virtual memory. But having even a slow one to start from will probably be faster than writing it from scratch myself.
The fmt library supports named arguments:
print("You clicked {button} at {x},{y}.",
arg("button", "b1"), arg("x", 50), arg("y", 30));
And as a syntactic sugar you can even (ab)use user-defined literals to pass arguments:
print("You clicked {button} at {x},{y}.",
"button"_a="b1", "x"_a=50, "y"_a=30);
For brevity the namespace fmt is omitted in the above examples.
Disclaimer: I'm the author of this library.
I've always been critical of C++ I/O (especially formatting) because in my opinion it is a step backward from C. Formats need to be dynamic, and it makes perfect sense, for example, to load them from an external resource such as a file or a parameter.
I had never tried to actually implement an alternative before, however, and your question prompted me to make an attempt, investing some weekend hours in the idea.
Sure, the problem was more complex than I thought (for example, just the integer formatting routine is 200+ lines), but I think that this approach (dynamic format strings) is more usable.
You can download my experiment from this link (it's just a .h file) and a test program from this link (test is probably not the correct term, I used it just to see if I was able to compile).
The following is an example
#include "format.h"
#include <iostream>
using format::FormatString;
using format::FormatDict;
int main()
{
std::cout << FormatString("The answer is %{x}") % FormatDict()("x", 42);
return 0;
}
It is different from the boost.format approach because it uses named parameters and because
the format string and format dictionary are meant to be built separately (and for
example passed around). Also I think that formatting options should be part of the
string (like printf) and not in the code.
FormatDict uses a trick for keeping the syntax reasonable:
FormatDict fd;
fd("x", 12)
("y", 3.141592654)
("z", "A string");
FormatString is instead just parsed from a const std::string& (I decided to preparse format strings but a slower but probably acceptable approach would be just passing the string and reparsing it each time).
The formatting can be extended for user defined types by specializing a conversion function template; for example
struct P2d
{
int x, y;
P2d(int x, int y)
: x(x), y(y)
{
}
};
namespace format {
template<>
std::string toString<P2d>(const P2d& p, const std::string& parms)
{
return FormatString("P2d(%{x}; %{y})") % FormatDict()
("x", p.x)
("y", p.y);
}
}
after that a P2d instance can be simply placed in a formatting dictionary.
Also it's possible to pass parameters to a formatting function by placing them between % and {.
For now I only implemented an integer formatting specialization that supports
Fixed size with left/right/center alignment
Custom filling char
Generic base (2-36), lower or uppercase
Digit separator (with both custom char and count)
Overflow char
Sign display
I've also added some shortcuts for common cases, for example
"%08x{hexdata}"
is a hex number with 8 digits padded with '0's.
"%026/2,8:{bindata}"
is a 24-bit binary number (as required by "/2") with digit separator ":" every 8 bits (as required by ",8:").
Note that the code is just an idea, and for example for now I just prevented copies when probably it's reasonable to allow storing both format strings and dictionaries (for dictionaries it's however important to give the ability to avoid copying an object just because it needs to be added to a FormatDict, and while IMO this is possible it's also something that raises non-trivial problems about lifetimes).
UPDATE
I've made a few changes to the initial approach:
Format strings can now be copied
Formatting for custom types is done using template classes instead of functions (this allows partial specialization)
I've added a formatter for sequences (two iterators). Syntax is still crude.
I've created a github project for it, with boost licensing.
The answer appears to be, no, there is not a C++ library that does this, and C++ programmers apparently do not even see the need for one, based on the comments I have received. I will have to write my own yet again.
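For what it's worth, a hand-rolled starting point does not have to be large. The sketch below assumes a simplified {name} placeholder syntax instead of Python's %(name)s, does no type-specific formatting, and mirrors the PrintFMap signature from the question; formatNamed is a hypothetical name.

```cpp
#include <map>
#include <string>

// Replaces each {name} in `format` with the value stored under `name` in `args`.
// Unknown names and an unterminated '{' are copied through unchanged.
std::string formatNamed(const std::string& format,
                        const std::map<std::string, std::string>& args) {
    std::string out;
    out.reserve(format.size());  // one up-front allocation for the common case
    std::string::size_type pos = 0;
    while (pos < format.size()) {
        std::string::size_type open = format.find('{', pos);
        if (open == std::string::npos) {
            out.append(format, pos, std::string::npos);  // no more placeholders
            break;
        }
        out.append(format, pos, open - pos);  // literal text before '{'
        std::string::size_type close = format.find('}', open + 1);
        if (close == std::string::npos) {
            out.append(format, open, std::string::npos);  // unterminated '{'
            break;
        }
        std::map<std::string, std::string>::const_iterator it =
            args.find(format.substr(open + 1, close - open - 1));
        if (it != args.end()) {
            out += it->second;
        } else {
            out.append(format, open, close - open + 1);  // keep "{name}" as is
        }
        pos = close + 1;
    }
    return out;
}
```

The same argument map can be reused for several format strings, and arguments that a given format does not mention are simply ignored.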
Well I'll add my own answer as well, not that I know (or have coded) such a library, but to answer to the "keep the memory allocation down" bit.
As always I can envision some kind of speed / memory trade-off.
On the one hand, you can parse "Just In Time":
class Formater:
    def __init__(self, format): self._string = format
    def compute(self):
        for k,v in context:
            while self.__contains(k):
                left, variable, right = self.__extract(k)
                self._string = left + self.__replace(variable, v) + right
This way you don't keep a "parsed" structure at hand, and hopefully most of the time you'll just insert the new data in place (unlike Python, C++ strings are not immutable).
However it's far from being efficient...
On the other hand, you can build a fully constructed tree representing the parsed format. You will have several classes like: Constant, String, Integer, Real, etc... and probably some subclasses / decorators as well for the formatting itself.
I think, however, that the most efficient approach would be some kind of mix of the two.
explode the format string into a list of Constant, Variable
index the variables in another structure (a hash table with open-addressing would do nicely, or something akin to Loki::AssocVector).
There you are: you're done with only 2 dynamically allocated arrays (basically). If you want to allow a same key to be repeated multiple times, simply use a std::vector<size_t> as a value of the index: good implementations should not allocate any memory dynamically for small sized vectors (VC++ 2010 doesn't for less than 16 bytes worth of data).
When evaluating the context itself, look up the instances. You then parse the formatter "just in time", check it against the current type of the value with which to replace it, and process the format.
Pros and cons:
- Just In Time: you scan the string again and again
- One Parse: requires a lot of dedicated classes, possibly many allocations, but the format is validated on input. Like Boost it may be reused.
- Mix: more efficient, especially if you don't replace some values (allow some kind of "null" value), but delaying the parsing of the format delays the reporting of errors.
Personally I would go for the One Parse scheme, trying to keep the allocations down using boost::variant and the Strategy Pattern as much I could.
Given that Python itself is written in C and that formatting is such a commonly used feature, you might be able (ignoring copyright issues) to rip the relevant code from the Python interpreter and port it to use STL maps rather than Python's native dicts.
I've written a library for this purpose, check it out on GitHub.
Contributions are welcome.

Most Efficient way to 'look up' Keywords

Alright so I am writing a function as part of a lexical analyzer that 'looks up' or searches for a match with a keyword. My lexer catches all the obvious tokens such as single and multi character operators (+ - * / > < = == etc) (also comments and whitespace are already taken out) so I call a function after I've collected a stream of only alphanumeric characters (including underscores) into a string, this string then needs to be matched as either a known keyword or an identifier.
So I was wondering how I might go about identifying it? I know I basically need to compare it to some list or array or something of all the built-in keywords, and if it matches one, return that match as its corresponding enum value; otherwise, if there is no match, then it must be a function or variable identifier. So how should I look for matches? I read somewhere that something called a Binary Search Tree is an efficient way to do it, or by using Hash Tables; the problem is I've never used either, so I am not sure if it's the right way. Could I possibly use a MySQL database?
If your set of keywords is fixed, a perfect hash can be built for O(1) lookup. Check out gperf or cmph.
A "trie" will surely be the most efficient way.
Whatever implementation of std::map you have will probably be sufficient.
This is for a language, with a specific set of keywords that never change, and there aren't very many of them?
If so, it probably doesn't matter what you use. You will have bigger fish to fry.
However, since the list doesn't change, it would be hard to beat a hard coded search like this:
// search on first letter
switch(s[0]){
case 'a':
// search on 2nd letter, etc.
break;
case 'b':
// search on 2nd letter, etc.
break;
........
case '_':
// search on 2nd letter, etc.
break;
}
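Filled in for a tiny, made-up keyword set ("if", "int", "else"), the hard-coded search could look like the sketch below. Token and lookupKeyword are hypothetical names; a real language would have more cases, and a generated version could go on to switch on the second letter.

```cpp
#include <string>

enum Token { TOK_IDENTIFIER, TOK_IF, TOK_INT, TOK_ELSE };

// Returns the keyword token for `s`, or TOK_IDENTIFIER if it is not a keyword.
Token lookupKeyword(const std::string& s) {
    if (s.empty()) {
        return TOK_IDENTIFIER;
    }
    // search on first letter
    switch (s[0]) {
    case 'i':
        // only keywords starting with 'i' are compared here
        if (s == "if") return TOK_IF;
        if (s == "int") return TOK_INT;
        break;
    case 'e':
        if (s == "else") return TOK_ELSE;
        break;
    }
    return TOK_IDENTIFIER;
}
```

Most identifiers are rejected after looking at a single character, and the few that share a first letter with a keyword cost at most a couple of short string comparisons.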
For single-character keywords a lookup table would be perfect. For multi-character keywords (especially if the lengths differ), use a hash table. If you need performance, you could even use source code generation to create the hash tables (using a simple hash function that does or does not ignore case, depending on your syntax).
So I'd implement it with a LUT and a hash table: first you check the first character with the LUT (if it's a simple operator, it would start with a non-alpha-numeric value), and, if not found, check the hash table.