Efficient way to compare more than 50 strings - C++

I have a method that takes two parameters: a string and an int.
The string has to be compared against more than 50 strings, and once a match is found, the int value is used as an index into a hard-coded list of strings, as in the example below.
Example:
string Compare_Method(std::string str, int val) {
if(str == "FIRST")
{
std::array<std::string, 3> real_value = {"Hello1","hai1","bye1"};
return real_value[val];
}
else if(str == "SECOND")
{
std::array<std::string, 4> real_value = {"Hello2","hai2","bye2"};
return real_value[val];
}
else if(str == "THIRD")
{
std::array<std::string, 5> real_value = {"Hello3","hai3","bye3"};
return real_value[val];
}
//----- 50+ else if
}
My approach is shown above. What would be an efficient way to:
1. compare against more than 50 strings, and
2. create a std::array for each if case?
EDITED: the std::array size is not fixed; it can be 3, 4 or 5, as shown in the edited code above.

This would be my way of doing that. The data structure is created only once and the access times should be fast enough
#include <iostream>
#include <string>
#include <array>
#include <unordered_map>
std::string Compare_Method(const std::string& str, int val)
{
// or std::vector<std::string>
static std::unordered_map<std::string, std::array<std::string, 3>> map
{
{ "FIRST", { "Hello1", "hail1", "bye1" }},
{ "SECOND", { "Hello2", "hail2", "bye2" }},
{ "THIRD", { "Hello3", "hail3", "bye3" }},
// 50+ more
};
// operator[] inserts an empty entry when str is absent, so check with find() first if unknown keys are possible
return map[str][val];
}
int main()
{
std::cout << Compare_Method("SECOND", 1) << std::endl;
}
If std::unordered_map isn't (fast) enough for you, you can come up with some sort of static optimal hash structure, since keys are known at compile time.
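The "check if str is present" step can be made concrete. The following is a minimal sketch rather than the answerer's code: the name lookup_value is hypothetical, std::vector is assumed so that each key can carry 3, 4 or 5 values, and returning an empty string for an unknown key or an out-of-range index is an assumption about the desired behaviour.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not from the original answer): returns an empty
// string when the key is unknown or the index is out of range.
std::string lookup_value(const std::string& key, std::size_t index)
{
    // Built once, on the first call; vectors allow 3, 4, 5, ... values per key.
    static const std::unordered_map<std::string, std::vector<std::string>> table{
        {"FIRST",  {"Hello1", "hai1", "bye1"}},
        {"SECOND", {"Hello2", "hai2", "bye2", "aloha2"}},
        {"THIRD",  {"Hello3", "hai3", "bye3", "aloha3", "tata3"}},
    };

    const auto it = table.find(key);   // find() never inserts, unlike operator[]
    if (it == table.end() || index >= it->second.size())
        return {};
    return it->second[index];
}
```

Because the table is const and only find() is used, a misspelled key cannot silently grow the map.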

If those 50 strings are something you will be using widely throughout your program, string comparisons will take a toll on performance. I'd suggest mapping them to an enum.
enum Strings
{
FIRST,
SECOND,
THIRD,
…
…
};
You'll obviously need a method to convert string to int whenever you get one from the source (user input, file read, etc.). This should be as infrequent as possible since your system now works on enum values (which can be used as indices on STL containers as we see in the next step)
int GetEnumIndex(const std::string& str)
{
// here you can map all variants of the same string to the same number
if ("FIRST" == str || "first" == str) return FIRST; // enum value, so it can index the table directly
…
}
Then, the comparison method can be based on the enum instead of the string:
std::string Compare_Method(const int& strIndex, int val)
{
static std::vector<std::vector<std::string>> stringArray
{
{ "Hello1", "hail1", "bye1" },
{ "Hello2", "hail2", "bye2", "aloha2" },
{ "Hello3", "hail3", "bye3", "aloha3", "tata3" },
…
};
return stringArray[strIndex][val];
}
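Put together, the enum idea might look like the sketch below. All names (Key, parse_key, value_for) are hypothetical, only three of the 50+ keys are shown, and the extra Unknown enumerator plus the empty-string fallback are assumptions added so that bad input does not index out of bounds.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical names throughout; only three of the 50+ keys are shown.
enum class Key : std::size_t { First, Second, Third, Unknown };

// Parse once at the program boundary (user input, file read, ...).
Key parse_key(const std::string& text)
{
    // Several spellings may map to the same enumerator.
    static const std::unordered_map<std::string, Key> names{
        {"FIRST", Key::First},   {"first", Key::First},
        {"SECOND", Key::Second}, {"second", Key::Second},
        {"THIRD", Key::Third},   {"third", Key::Third},
    };
    const auto it = names.find(text);
    return it == names.end() ? Key::Unknown : it->second;
}

// Afterwards every lookup is plain indexing, with no string comparison.
std::string value_for(Key key, std::size_t index)
{
    static const std::vector<std::vector<std::string>> values{
        {"Hello1", "hail1", "bye1"},
        {"Hello2", "hail2", "bye2", "aloha2"},
        {"Hello3", "hail3", "bye3", "aloha3", "tata3"},
    };
    const auto row = static_cast<std::size_t>(key);
    if (row >= values.size() || index >= values[row].size())
        return {};   // Key::Unknown or a bad index
    return values[row][index];
}
```

The string is hashed once in parse_key; code that already holds a Key pays only for two bounds checks and an index.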

With the information you provided, I tried several variations to find the best way to achieve the objective. I am listing the best one here. You can see other methods here.
You can compile it and run run.sh to compare the performance of all cases.
std::string Method6(const std::string &str, int val) {
static std::array<std::string, 5> NUMBERS{"FIRST", "SECOND", "THIRD",
"FOURTH", "FIFTH"};
static std::array<std::vector<std::string>, 5> VALUES{
std::vector<std::string>{"FIRST", "hai1", "bye1"},
std::vector<std::string>{"Hello1", "SECOND", "bye1"},
std::vector<std::string>{"Hello1", "hai1", "THIRD"},
std::vector<std::string>{"FOURTH", "hai1", "bye1"},
std::vector<std::string>{"Hello1", "FIFTH", "bye1"}};
for (std::size_t i = 0; i < NUMBERS.size(); ++i) {
if (NUMBERS[i] == str) {
return VALUES[i][val];
}
}
return "";
}
For simplicity I have used NUMBERS with a length of 5, but you can use whatever length you want.
VALUES is a std::array of std::vector, so you can add any number of elements to each std::vector.
Output from the GitHub code:
Method1 880
Method2 851
Method3 7292
Method4 989
Method5 598
Method6 440
Your output may differ depending on your system and the system load at the time of execution.

Related

format string with a variable size vector of arguments (e.g. pass vector of arguments to std::snprintf)

I'm looking for a way to format a string with a variable-size vector of variables. What do you suggest is the best way of doing this?
I already know about std::snprintf and std::vsnprintf, but unfortunately neither works out of the box for my problem. Also, a solution with recursive templates won't work for me because I can't rely on the input format being fully defined at compile time.
Here is a sample interface for the function I'm trying to implement.
std::string format_variable_size(const char* format, const std::vector<int>& in) {
std::string out{};
....
return out;
}
Example input and output:
const char* format = "My first int is %d, my second int is: %d, my float is: %d";
std::vector<int> in = {1,2,3};
the format_variable_size would return
out = "My first int is 1, my second int is: 2, my float is: 3"
Another example:
const char* format = "My first int is %d, my second int is: %d";
std::vector<int> in = {1,2};
the format_variable_size would return
"My first int is 1, my second int is: 2"
Thanks,
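If you have nothing against using fmt, I think something like the following might work (untested; it assumes {fmt}'s dynamic_format_arg_store from <fmt/args.h>, and {}-style placeholders instead of %d):
#include <fmt/args.h>
#include <fmt/format.h>
#include <iostream>
#include <string>
#include <vector>
std::string format_variable_size(const char* fmt_str, const std::vector<int>& args){
// collect the run-time arguments first, then format them in a single call
fmt::dynamic_format_arg_store<fmt::format_context> store;
for (int arg : args)
store.push_back(arg);
return fmt::vformat(fmt_str, store);
}
std::vector<int> v = {1,2,3};
std::cout << format_variable_size("[{}, {}, {}]\n", v);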
If the only specifier you use is %d, then you can easily do a loop and manually replace with the value coming from the vector. Alternatively, you might consider defining your own replacement token (for example ###) to simplify parsing.
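A manual replacement loop could look like the sketch below. The function name and signature follow the interface from the question; everything else is an assumption: only "%d" and "%%" are recognised, surplus values are ignored, and "%d" tokens left over once the values run out are copied through unchanged.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Substitutes each "%d" in `format` with the next value from `in`.
// "%%" yields a literal '%'. Assumption: when the values run out,
// remaining "%d" tokens are copied through unchanged.
std::string format_variable_size(const char* format, const std::vector<int>& in)
{
    const std::string fmt(format);
    std::string out;
    std::size_t next = 0;                      // index of the next unused value
    for (std::size_t i = 0; i < fmt.size(); ++i) {
        if (fmt[i] == '%' && i + 1 < fmt.size()) {
            if (fmt[i + 1] == 'd' && next < in.size()) {
                out += std::to_string(in[next++]);
                ++i;                           // skip the 'd'
                continue;
            }
            if (fmt[i + 1] == '%') {
                out += '%';
                ++i;                           // skip the second '%'
                continue;
            }
        }
        out += fmt[i];                         // ordinary character
    }
    return out;
}
```

Since no printf-family function ever sees the format string, a mismatch between the number of %d tokens and the vector size cannot cause undefined behaviour here.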
Also, if you can live with a relatively small maximum vector size (say, at most 10 elements), you could simply do something like this:
std::vector<int> copy(in);
copy.resize(10); // missing elements are value-initialized to 0
std::snprintf(buffer, buffer_size, format,
copy[0], copy[1], copy[2], copy[3], copy[4],
copy[5], copy[6], copy[7], copy[8], copy[9]);
If your format string contains fewer %d than the size of the passed vector, then only the first values will be output.
If the sizes match exactly, you get the expected result.
If the number of %d is greater than the size of the input vector but no greater than the size of the copy, the extra %d will be output as 0.
If you have more %d than that, the behavior is undefined.
If the format string is internal, this might be acceptable, but if it comes from a user or a file, it is better to validate the string (in which case, processing it manually might be attractive).
Not a pretty solution, since we can't get the std::vector's size at compile time:
template <std::size_t... I>
std::string
format_variable_size_impl(const char* format, const std::vector<int>& in,
std::index_sequence<I...>)
{
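// Determine the necessary length (excluding the terminating null)
auto size = std::snprintf(nullptr, 0, format, in[I]...);
if (size < 0)
return {}; // snprintf reported an encoding error
std::string out(size, '\0');
// writes size characters plus the null terminator at out.data()[size]
std::snprintf(out.data(), size + 1, format, in[I]...);
return out;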
}
std::string
format_variable_size(const char* format, const std::vector<int>& in)
{
if (in.size() == 0)
return format;
if (in.size() == 1)
return format_variable_size_impl(format, in, std::make_index_sequence<1>{});
if (in.size() == 2)
return format_variable_size_impl(format, in, std::make_index_sequence<2>{});
if (in.size() == 3)
return format_variable_size_impl(format, in, std::make_index_sequence<3>{});
if (in.size() == 4)
return format_variable_size_impl(format, in, std::make_index_sequence<4>{});
if (in.size() == 5)
return format_variable_size_impl(format, in, std::make_index_sequence<5>{});
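// ...
return {}; // sizes not handled above; without this, control falls off the end (undefined behavior)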
}

How to cut off parts of a string, which every string in a collection has

My current problem is the following:
I have a std::vector of full path names to files.
Now I want to cut off the common prefix of all strings.
Example
If I have these 3 strings in the vector:
/home/user/foo.txt
/home/user/bar.txt
/home/baz.txt
I would like to cut off /home/ from every string in the vector.
Question
Is there any method to achieve this in general?
I want an algorithm that drops the common prefix of all strings.
I currently only have an idea that solves this problem in O(n m), where n is the number of strings and m is the length of the longest string, by just going through every string with every other string char by char.
Is there a faster or more elegant way of solving this?
This can be done entirely with std:: algorithms.
Synopsis:
1. Sort the input range if it is not already sorted. The first and last paths in the sorted range will be the most dissimilar. Best case is O(N), worst case O(N + N.logN).
2. Use std::mismatch to determine the largest common sequence between the two most dissimilar paths [insignificant].
3. Run through each path, erasing the first COUNT characters, where COUNT is the number of characters in the longest common sequence. O(N).
Best case time complexity: O(2N), worst case O(2N + N.logN) (can someone check that?)
#include <iostream>
#include <algorithm>
#include <string>
#include <vector>
std::string common_substring(const std::string& l, const std::string& r)
{
return std::string(l.begin(),
std::mismatch(l.begin(), l.end(),
r.begin(), r.end()).first);
}
std::string mutating_common_substring(std::vector<std::string>& range)
{
if (range.empty())
return std::string();
else
{
if (not std::is_sorted(range.begin(), range.end()))
std::sort(range.begin(), range.end());
return common_substring(range.front(), range.back());
}
}
std::vector<std::string> chop(std::vector<std::string> samples)
{
auto str = mutating_common_substring(samples);
for (auto& s : samples)
{
s.erase(s.begin(), std::next(s.begin(), str.size()));
}
return samples;
}
int main()
{
std::vector<std::string> samples = {
"/home/user/foo.txt",
"/home/user/bar.txt",
"/home/baz.txt"
};
samples = chop(std::move(samples));
for (auto& s : samples)
{
std::cout << s << std::endl;
}
}
expected:
baz.txt
user/bar.txt
user/foo.txt
Here's an alternate common_substring which does not require a sort (it needs #include <numeric> for std::accumulate). Time complexity is in theory O(N), but whether it's faster in practice you'd have to check:
std::string common_substring(const std::vector<std::string>& range)
{
if (range.empty())
{
return {};
}
return std::accumulate(std::next(range.begin(), 1), range.end(), range.front(),
[](auto const& best, const auto& sample)
{
return common_substring(best, sample);
});
}
update:
Elegance aside, this is probably the fastest way since it avoids any memory allocations, performing all transformations in-place. For most architectures and sample sizes, this will matter more than any other performance consideration.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>
#include <string>
void reduce_to_common(std::string& best, const std::string& sample)
{
best.erase(std::mismatch(best.begin(), best.end(),
sample.begin(), sample.end()).first,
best.end());
}
void remove_common_prefix(std::vector<std::string>& range)
{
if (range.size())
{
auto iter = range.begin();
auto best = *iter;
for ( ; ++iter != range.end() ; )
{
reduce_to_common(best, *iter);
}
auto prefix_length = best.size();
for (auto& s : range)
{
s.erase(s.begin(), std::next(s.begin(), prefix_length));
}
}
}
int main()
{
std::vector<std::string> samples = {
"/home/user/foo.txt",
"/home/user/bar.txt",
"/home/baz.txt"
};
remove_common_prefix(samples);
for (auto& s : samples)
{
std::cout << s << std::endl;
}
}
You have to search every string in the list. However you don't need to compare all the characters in every string. The common prefix can only get shorter, so you only need to compare with "the common prefix so far". I don't think this changes the big-O complexity - but it will make quite a difference to the actual speed.
Also, these look like file names. Are they sorted (bearing in mind that many filesystems tend to return things in sorted order)? If so, you only need to consider the first and last elements. If they are probably or mostly ordered, then consider the common prefix of the first and last, and then iterate through all the other strings shortening the prefix further as necessary.
You just have to iterate over every string. You can only avoid iterating over the full length of strings needlessly by exploiting the fact, that the prefix can only shorten:
#include <iostream>
#include <string>
#include <vector>
std::string common_prefix(const std::vector<std::string> &ss) {
if (ss.empty())
// no prefix
return "";
std::string prefix = ss[0];
for (size_t i = 1; i < ss.size(); i++) {
size_t c = 0; // index after which the string differ
for (; c < prefix.length() && c < ss[i].length(); c++) {
if (prefix[c] != ss[i][c]) {
// strings differ from character c on
break;
}
}
if (c == 0)
// no common prefix
return "";
// the prefix is only up to character c-1, so resize prefix
prefix.resize(c);
}
return prefix;
}
void strip_common_prefix(std::vector<std::string> &ss) {
std::string prefix = common_prefix(ss);
if (prefix.empty())
// no common prefix, nothing to do
return;
// drop the common part, which are always the first prefix.length() characters
for (std::string &s: ss) {
s = s.substr(prefix.length());
}
}
int main()
{
std::vector<std::string> ss { "/home/user/foo.txt", "/home/user/bar.txt", "/home/baz.txt"};
strip_common_prefix(ss);
for (std::string &s: ss)
std::cout << s << "\n";
}
Drawing from the hints of Martin Bonner's answer, you may implement a more efficient algorithm if you have more prior knowledge on your input.
In particular, if you know your input is sorted, it suffices to compare the first and last strings (see Richard's answer).
i - Find the file which has the least folder depth (i.e. baz.txt) - its root path is home
ii - Then go through the other strings to see if they start with that root.
iii - If so then remove root from all the strings.
Start with std::size_t index=0;. Scan the list to see if characters at that index match (note: past the end does not match). If it does, advance index and repeat.
When done, index will have the value of the length of the prefix.
At this point, I'd advise you to write or find a string_view type. If you do, simply create a string_view for each of your strings str covering the range from index to str.size().
Overall cost: O(|prefix|*N+N), which is also the cost to confirm that your answer is correct.
If you don't want to write a string_view, simply call str.erase(str.begin(), str.begin()+index) on each str in your vector.
Overall cost is O(|total string length|+N). The prefix has to be visited in order to confirm it, then the tail of the string has to be rewritten.
Now the cost of the breadth-first is locality, as you are touching memory all over the place. It will probably be more efficient in practice to do it in chunks, where you scan the first K strings up to length Q and find the common prefix, then chain that common prefix plus the next block. This won't change the O-notation, but will improve locality of memory reference.
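The column-by-column scan and the view-based result might be sketched as follows. This assumes C++17's std::string_view as the "string_view type"; the function names common_prefix_length and without_common_prefix are hypothetical, and the chunked variant for better locality is not shown.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Length of the prefix shared by every string, found column by column.
std::size_t common_prefix_length(const std::vector<std::string>& strs)
{
    if (strs.empty())
        return 0;
    std::size_t index = 0;
    for (;;) {
        if (index >= strs.front().size())
            return index;                       // past the end does not match
        const char expected = strs.front()[index];
        for (const std::string& s : strs)
            if (index >= s.size() || s[index] != expected)
                return index;
        ++index;                                // whole column matched
    }
}

// Views into the original strings; nothing is copied or rewritten.
// The views are valid only while `strs` is alive and unmodified.
std::vector<std::string_view> without_common_prefix(const std::vector<std::string>& strs)
{
    const std::size_t index = common_prefix_length(strs);
    std::vector<std::string_view> views;
    views.reserve(strs.size());
    for (const std::string& s : strs)
        views.push_back(std::string_view(s).substr(index));
    return views;
}
```

For the three paths from the question, common_prefix_length returns 6 (the length of "/home/"), and the views start at "user/foo.txt", "user/bar.txt" and "baz.txt".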
for(vector<string>::iterator itr=V.begin(); itr!=V.end(); ++itr)
itr->erase(0,6);

Is a compile-time checked string-to-int map possible?

I'm probably trying to achieve the impossible, but StackExchange always surprises me, so please have a go at this:
I need to map a name to an integer. The names (about 2k) are unique. There will be no additions nor deletions to that list and the values won't change during runtime.
Implementing them as const int variables gives me compile-time checks for existence and type.
Also this is very clear and verbose in code. Errors are easily spotted.
Implementing them as std::map<std::string, int> gives me a lot of flexibility for building the names to look up with string manipulation. I may use this to give strings as parameters to functions which then can query the list for multiple values by appending pre-/suffixes to that string. I can also loop over several values by creating a numeral part of the key name from the loop variable.
Now my question is: is there a method to combine both advantages? The missing compile-time check (especially for key-existence) almost kills the second method for me. (Especially as std::map silently returns 0 if the key doesn't exist which creates hard to find bugs.) But the looping and pre-/suffix adding capabilities are so damn useful.
I would prefer a solution that doesn't use any additional libraries like boost, but please suggest them nevertheless as I might be able to re-implement them anyway.
An example on what I do with the map:
void init(std::map<std::string, int> &labels)
{
labels.insert(std::make_pair("Bob1" , 45 ));
labels.insert(std::make_pair("Bob2" , 8758 ));
labels.insert(std::make_pair("Bob3" , 436 ));
labels.insert(std::make_pair("Alice_first" , 9224 ));
labels.insert(std::make_pair("Alice_last" , 3510 ));
}
int main()
{
std::map<std::string, int> labels;
init(labels);
for (int i=1; i<=3; i++)
{
std::stringstream key;
key << "Bob" << i;
doSomething(labels[key.str()]);
}
checkName("Alice");
}
void checkName(std::string name)
{
std::stringstream key1,key2;
key1 << name << "_first";
key2 << name << "_last";
doFirstToLast(labels[key1.str()], labels[key2.str()]);
}
Another goal is that the code shown in the main() routine stays as easy and verbose as possible. (Needs to be understood by non-programmers.) The init() function will be code-generated by some tools. The doSomething(int) functions are fixed, but I can write wrapper functions around them. Helpers like checkName() can be more complicated, but need to be easily debuggable.
One way to implement your example is using an enum and token pasting, like this
enum {
Bob1 = 45,
Bob2 = 8758,
Bob3 = 436,
Alice_first = 9224,
Alice_last = 3510
};
#define LABEL( a, b ) ( a ## b )
int main()
{
doSomething( LABEL(Bob,1) );
doSomething( LABEL(Bob,2) );
doSomething( LABEL(Bob,3) );
}
void checkName()
{
doFirstToLast( LABEL(Alice,_first), LABEL(Alice,_last) );
}
Whether or not this is best depends on where the names come from.
If you need to support the for loop use-case, then consider
int bob[] = { 0, Bob1, Bob2, Bob3 }; // Values from the enum
int main()
{
for( int i = 1; i <= 3; i++ ) {
doSomething( bob[i] );
}
}
I'm not sure I understand all your requirements, but how about something like this, without using std::map.
I am assuming that you have three strings, "FIRST", "SECOND" and "THIRD" that you
want to map to 42, 17 and 37, respectively.
#include <stdio.h>
const int m_FIRST = 0;
const int m_SECOND = 1;
const int m_THIRD = 2;
const int map[] = {42, 17, 37};
#define LOOKUP(s) (map[m_ ## s])
int main ()
{
printf("%d\n", LOOKUP(FIRST));
printf("%d\n", LOOKUP(SECOND));
return 0;
}
The disadvantage is that you cannot use variable strings with LOOKUP. But now you can iterate over the values.
Maybe something like this (untested)?
struct Bob {
static constexpr int values[3] = { 45, 8758, 436 };
};
struct Alice {
struct first {
static const int value = 9224;
};
struct last {
static const int value = 3510;
};
};
template <typename NAME>
void checkName()
{
doFirstToLast(NAME::first::value, NAME::last::value);
}
...
constexpr int Bob::values[3]; // need a definition in exactly one TU
int main()
{
for (int i=0; i<3; i++) // valid indices of Bob::values are 0..2
{
doSomething(Bob::values[i]);
}
checkName<Alice>();
}
Using enum you have both compile-time check and you can loop over it:
How can I iterate over an enum?
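Another option, assuming C++17 is available, is a constexpr lookup function over a generated table: names written as literals are checked by the compiler, while names built at run time go through the same function and throw instead of silently yielding 0. The sketch below is only an illustration; Label, labels, label and bob are hypothetical names, and the linear search is chosen for brevity, not for speed with about 2k entries.

```cpp
#include <stdexcept>
#include <string>
#include <string_view>

struct Label {
    std::string_view name;
    int value;
};

// Values taken from the question; the table would be code-generated.
constexpr Label labels[] = {
    {"Bob1", 45}, {"Bob2", 8758}, {"Bob3", 436},
    {"Alice_first", 9224}, {"Alice_last", 3510},
};

// Hypothetical lookup: usable at compile time and at run time.
constexpr int label(std::string_view name)
{
    for (const Label& entry : labels)
        if (entry.name == name)
            return entry.value;
    // Reaching this in a constant expression is a compile error;
    // at run time it throws instead of silently returning 0.
    throw std::out_of_range("unknown label");
}

// Literal names are checked by the compiler:
static_assert(label("Alice_first") == 9224, "label table mismatch");
// constexpr int typo = label("Alcie_first");   // would not compile

// Run-time keys built by string manipulation go through the same function.
int bob(int i) { return label("Bob" + std::to_string(i)); }
```

The compile-time check only applies where the result is required to be a constant expression (constexpr variables, static_assert, template arguments); a plain call such as label("Bob4") is still checked only at run time.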

C++ double sorting data with multiple elements

I have multiple data entries that contain the following information:
id_number
name1
date
name2
It is possible to put this into a struct like this:
struct entry {
int id_number;
string name1;
int date;
string name2;
};
In my data, I have many such entries and I would like to sort. First, I want to sort alphabetically based on name1, then sort by date. However, the sort by date is a subset of the alphabetical sort, e.g. if I have two entries with the same name1, I then want to order those entries by date. Furthermore, when I sort, I want the elements of the entry to remain together, so all four values go together.
My questions are the following:
1) What type of data structure should I use to hold this data so I can keep the set of four elements together when I sort by any one of them?
2) What is the quickest way to do this sorting (in terms of amount of time to write the code)? Ideally, I want to use something like the sort in <algorithm>, since it is already built in.
3) Does STL have some built in data structure that can handle the double sorting I described efficiently?
The struct you have is fine, except that you may want to add an overload of operator< to do comparison. Here I'm doing the "compare by name, then date" comparison:
// Add this as a member function to `entry`.
bool operator<(entry const &other) const {
if (name1 < other.name1)
return true;
if (name1 > other.name1)
return false;
// otherwise name1 == other.name1
// so we now fall through to use the next comparator.
if (date < other.date)
return true;
return false;
}
[Edit: What's required is called a "strict weak ordering". If you want to get into detail about what that means, and what alternatives are possible, Dave Abrahams wrote quite a detailed post on C++ Next about it.
In the case above, we start by comparing the name1 fields of the two. If a<b, then we immediately return true. Otherwise, we check for a>b, and if so we return false. At that point, we've eliminated a<b and a>b, so we've determined that a==b, in which case we test the dates -- if a<b, we return true. Otherwise, we return false -- either the dates are equal, or b>a, either of which means the test for a<b is false. If the sort needs to sort out (no pun intended) which of those is the case, it can call the function again with the arguments swapped. The names will still be equal, so it'll still come down to the dates -- if we get false, the dates are equal. If we get true on the swapped dates, then what started as the second date is actually greater. ]
The operator< you define in the structure defines the order that will be used by default. When/if you want you can specify another order for the sorting to use:
struct byid {
bool operator()(entry const &a, entry const &b) const {
return a.id_number < b.id_number;
}
};
std::vector<entry> entries;
// sort by name, then date
std::sort(entries.begin(), entries.end());
// sort by ID
std::sort(entries.begin(), entries.end(), byid());
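The "compare by name, then date" ordering can also be written with std::tie, which leaves the lexicographic comparison to std::tuple. A minimal sketch, assuming C++11 and repeating the struct from the question so that the block is self-contained; by_name_then_date and sort_entries are hypothetical names.

```cpp
#include <algorithm>
#include <string>
#include <tuple>
#include <vector>

struct entry {
    int id_number;
    std::string name1;
    int date;
    std::string name2;
};

// Orders by name1 first, then by date. std::tie builds tuples of references,
// and tuple comparison is lexicographic, so this is a strict weak ordering.
bool by_name_then_date(const entry& a, const entry& b)
{
    return std::tie(a.name1, a.date) < std::tie(b.name1, b.date);
}

void sort_entries(std::vector<entry>& entries)
{
    std::sort(entries.begin(), entries.end(), by_name_then_date);
}
```

Adding a third sort key is then a matter of listing one more member in both std::tie calls, in the same position.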
That data structure should work just fine. What you should do is overload the less-than operator; then you can just insert all the entries into a map and they will be sorted. Here is more info on the comparison operators for a map
Update: upon further reflection, I would use a set rather than a map, because there is no need for a value. But here is proof it still works
Proof this works:
#include<string>
#include<map>
#include<stdio.h>
#include <sstream>
using namespace std;
struct entry {
int m_id_number;
string m_name1;
int m_date;
string m_name2;
entry( int id_number, string name1, int date, string name2) :
m_id_number(id_number),
m_name1(name1),
m_date(date),
m_name2(name2)
{
}
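// Add this as a member function to `entry`.
// Compares name1, then name2, then date; a later field is consulted only when the earlier ones are equal.
bool operator<(entry const &other) const {
if (m_name1 != other.m_name1)
return m_name1 < other.m_name1;
if (m_name2 != other.m_name2)
return m_name2 < other.m_name2;
return m_date < other.m_date;
}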
string toString() const
{
string returnValue;
stringstream out;
string dateAsString;
out << m_date;
dateAsString = out.str();
returnValue = m_name1 + " " + m_name2 + " " + dateAsString;
return returnValue;
}
};
int main(int argc, char *argv[])
{
string names1[] = {"Dave", "John", "Mark", "Chris", "Todd"};
string names2[] = {"A", "B", "C", "D", "E", "F", "G"};
std::map<entry, int> mymap;
for(int x = 0; x < 100; ++x)
{
mymap.insert(pair<entry, int>(entry(0, names1[x%5], x, names2[x%7]), 0));
}
std::map<entry, int>::iterator it = mymap.begin();
for(; it != mymap.end() ;++it)
{
printf("%s\n ", it->first.toString().c_str());
}
return 0;
}
Actually, you can use a function object to implement your sorting criteria.
Suppose that you would like to store the entries in a set:
//EntrySortCriteria.h
// (assumes the definition of entry is visible here)
class EntrySortCriteria
{
public:
bool operator()(const entry &e1, const entry &e2) const
{
// name1 decides first; date decides only when the names are equal
return e1.name1 < e2.name1 ||
(!(e2.name1 < e1.name1) && e1.date < e2.date);
}
};
//main.cc
#include <iostream>
#include <set>
#include "EntrySortCriteria.h"
using namespace std;
int main(int argc, char **argv)
{
set<entry, EntrySortCriteria> entrySet;
//then you can put entries into this set,
//they will be sorted automatically according to your criteria
//syntax of set:
//entrySet.insert(newEntry);
//where newEntry is a object of your entry type
}

Concise lists/vectors in C++

I'm currently translating an algorithm in Python to C++.
This line EXCH_SYMBOL_SETS = [["i", "1", "l"], ["s", "5"], ["b", "8"], ["m", "n"]]
is now
vector<vector<char>> exch_symbols;
vector<char> vector_1il;
vector_1il.push_back('1');
vector_1il.push_back('i');
vector_1il.push_back('l');
vector<char> vector_5s;
vector_5s.push_back('5');
vector_5s.push_back('s');
vector<char> vector_8b;
vector_8b.push_back('8');
vector_8b.push_back('b');
vector<char> vector_mn;
vector_mn.push_back('m');
vector_mn.push_back('n');
exch_symbols.push_back(vector_1il);
exch_symbols.push_back(vector_5s);
exch_symbols.push_back(vector_8b);
exch_symbols.push_back(vector_mn);
I hate to have an intermediate named variable for each inner variable in a 2-D vector. I'm not really familiar with multidimensional datastructures in C++. Is there a better way?
What's happening afterwards is this:
multimap<char, char> exch_symbol_map;
/*# Insert all possibilities
for symbol_set in EXCH_SYMBOL_SETS:
for symbol in symbol_set:
for symbol2 in symbol_set:
if symbol != symbol2:
exch_symbol_map[symbol].add(symbol2)*/
void insert_all_exch_pairs(const vector<vector<char>>& exch_symbols) {
for (vector<vector<char>>::const_iterator symsets_it = exch_symbols.begin();
symsets_it != exch_symbols.end(); ++symsets_it) {
for (vector<char>::const_iterator sym1_it = symsets_it->begin();
sym1_it != symsets_it->end(); ++sym1_it) {
for (vector<char>::const_iterator sym2_it = symsets_it->begin();
sym2_it != symsets_it->end(); ++sym2_it) {
if (sym1_it != sym2_it) {
exch_symbol_map.insert(pair<char, char>(*sym1_it, *sym2_it));
}
}
}
}
}
So this algorithm should work in one way or another with the representation here. The goal is that EXCH_SYMBOL_SETS can be easily changed later to include new groups of chars or add new letters to existing groups. Thank you!
I would refactor: instead of vector<char>, use std::string as the inner type, i.e.
vector<string> exch_symbols;
exch_symbols.push_back("1il");
exch_symbols.push_back("s5");
exch_symbols.push_back("b8");
exch_symbols.push_back("mn");
then change your insert method:
void insert_all_exch_pairs(const vector<string>& exch_symbols)
{
for (vector<string>::const_iterator symsets_it = exch_symbols.begin(); symsets_it != exch_symbols.end(); ++symsets_it)
{
for (string::const_iterator sym1_it = symsets_it->begin(); sym1_it != symsets_it->end(); ++sym1_it)
{
for (string::const_iterator sym2_it = symsets_it->begin(); sym2_it != symsets_it->end(); ++sym2_it)
{
if (sym1_it != sym2_it)
exch_symbol_map.insert(pair<char, char>(*sym1_it, *sym2_it));
}
}
}
}
You could shorten it by getting rid of the intermediate values
vector<vector<char> > exch_symbols(4, vector<char>()); //>> is not valid in C++98 btw.
//exch_symbols[0].reserve(3)
exch_symbols[0].push_back('i');
etc.
You could also use boost.assign or something similar
EXCH_SYMBOL_SETS = [["i", "1", "l"], ["s", "5"], ["b", "8"], ["m", "n"]] then becomes
vector<vector<char>> exch_symbols(list_of(vector<char>(list_of('i')('1')('l')))(vector<char>(list_of('s')('5'))(list_of('m')('n'))) (not tested and never used it with nested vectors, but it should be something like this)
For your real question of...
how could I translate L = [A, [B],
[[C], D]]] to C++ ... at all!
There is no direct translation - you've switched from storing values of the same type to storing values of variable type. Python allows this because it's a dynamically typed language, not because it has a nicer array syntax.
There are ways to replicate the behaviour in C++ (e.g. a vector of boost::any or boost::variant, or a user defined container class that supports this behaviour), but it's never going to be as easy as it is in Python.
Your code:
vector<char> vector_1il;
vector_1il.push_back('1');
vector_1il.push_back('i');
vector_1il.push_back('l');
Concise code:
char values[] = "1il";
vector<char> vector_1il(&values[0], &values[3]);
Is it fine with you?
If you want to use std::string as suggested by Nim, then you can use even this:
//Concise form of what Nim suggested!
std::string s[] = {"1il", "5s", "8b", "mn"};
vector<std::string> exch_symbols(&s[0], &s[4]);
Rest you can follow Nim's post. :-)
In c++0x the instruction
vector<string> EXCH_SYMBOL_SETS={"i1l", "s5", "b8", "mn"} ;
compiles and works fine. Sadly enough the apparently similar statement
vector<vector<char>> EXCH_SYMBOL_SETS={{'i','1','l'},{'s','5'}, {'b','8'}, {'m','n'}};
doesn't work :-(.
This is implemented in g++ 4.5.0 or later; you should add the -std=c++0x option. I think this feature is not yet available in Microsoft's compiler (VC10), and I don't know the status of other compilers.
I know that this is an old post, but in case anyone stumbles across it, C++ has gotten MUCH better at dealing with this stuff:
In C++11 the first code block can simply be rewritten as:
std::vector<std::string> exch_symbols {"1il", "5s", "8b", "mn"};
This isn't special to string either, we can nest vector like so:
std::vector<std::vector<int>> vov {{1,2,3}, {2,3,5,7,11}};
And here's the entire code in c++14-style, with an added cout at the end:
#include <iostream>
#include <map>
#include <string>
#include <vector>
void add_all_char_pairs (std::multimap<char, char> & mmap, const std::string & str)
{
// we choose not to add {str[i], str[i]} pairs for some reason...
const int s = str.size();
for (int i1 = 0; i1 < s; ++i1)
{
char c1 = str[i1];
for (int i2 = i1 + 1; i2 < s; ++i2)
{
char c2 = str[i2];
mmap.insert({c1, c2});
mmap.insert({c2, c1});
}
}
}
auto all_char_pairs_of_each_str (const std::vector<std::string> & strs)
{
std::multimap<char, char> mmap;
for (auto & str : strs)
{
add_all_char_pairs(mmap, str);
}
return mmap;
}
int main ()
{
std::vector<std::string> exch_symbols {"1il", "5s", "8b", "mn"};
auto mmap = all_char_pairs_of_each_str(exch_symbols);
for (auto e : mmap)
{
std::cout << e.first << e.second << std::endl;
}
}