Storing and accessing a collection of strings (STD C++)

SKU1 SKU2 Description
"01234" "34545" "White Bread"
"01545" "34236" "Wheat Bread"
I need to cross-reference these three fields, i.e. retrieve SKU2 while knowing SKU1, SKU1 while knowing SKU2, and Description while knowing either SKU1 or SKU2.
I'm curious - what is the best way to do this? Vectors using search() or find()? Using a map somehow?
I currently have it working using a vector< vector<string> >, looping through the 'parent' vectors and the 'child' vectors, comparing the values, but this seems primitive.
Basically, I need a vector that uses any of its strings as an index to return one of the two other values. Is the general way I'm doing it considered acceptable/optimal?
vector< vector<string> > products;
int i = 0;
for( i = 0; i < 2; ++i)
{
products.push_back( vector<string>() );
products[i].push_back( "SKU1" );
products[i].push_back( "SKU2" );
products[i].push_back( "Description" );
}
Thanks for your assistance.

Boost BiMap.

I would recommend using two maps that index into an object that has the information you need:
struct MyInfo
{
std::string SKU1;
std::string SKU2;
std::string Description;
};
std::map<std::string, MyInfo *> SKU1map;
std::map<std::string, MyInfo *> SKU2map;
MyInfo * newProduct = new MyInfo; // Do not forget to delete!
newProduct->SKU1 = "01234"; // SKU1 value
newProduct->SKU2 = "34545"; // SKU2 value
newProduct->Description = "White Bread"; // Description value
SKU1map[newProduct->SKU1] = newProduct;
SKU2map[newProduct->SKU2] = newProduct;
This gives O(log n) lookups by either SKU (much better than a linear search), and because both maps point at the same MyInfo object rather than storing copies of its strings, it is also more memory-efficient when you deal with many product instances.
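A minimal sketch of the same two-index idea, assuming the records can live in a std::vector so that no manual delete is needed. The names Product, ProductCatalog, findBySku1 and findBySku2 are hypothetical and not part of the answer above; the maps store positions in the vector instead of raw pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Product {
    std::string sku1;
    std::string sku2;
    std::string description;
};

// Hypothetical helper: owns the records and keeps one index per SKU column.
class ProductCatalog {
public:
    void add(const Product& p) {
        records_.push_back(p);
        const std::size_t pos = records_.size() - 1;
        bySku1_[p.sku1] = pos;  // index by first SKU
        bySku2_[p.sku2] = pos;  // index by second SKU
    }

    // Return a pointer to the record, or nullptr when the SKU is unknown.
    const Product* findBySku1(const std::string& sku) const {
        return lookup(bySku1_, sku);
    }
    const Product* findBySku2(const std::string& sku) const {
        return lookup(bySku2_, sku);
    }

private:
    typedef std::map<std::string, std::size_t> Index;

    const Product* lookup(const Index& index, const std::string& sku) const {
        Index::const_iterator it = index.find(sku);
        return it == index.end() ? nullptr : &records_[it->second];
    }

    std::vector<Product> records_;
    Index bySku1_;
    Index bySku2_;
};
```

With this layout, findBySku1("01234")->sku2 yields the second SKU and findBySku2("34236")->description yields the description. A returned pointer is only valid until the next call to add, since the vector may reallocate.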

Build three std::map<std::string, std::string>s: one to map SKU1s to SKU2s, one to map SKU1s to Descriptions, and one to map SKU2s to Descriptions. (Better yet, use std::unordered_map if you have it; it was standardized in C++11, formerly known as C++0x.)
This is assuming that you have a lot of data and are prioritizing speed rather than memory usage.

Related

map string to vector or map different keys to one value

I need to map a single string to multiple strings, to do this i thought about two different solutions:
The first is to map each string to a vector so that when i look at the key i get the vector in return. std::unordered_map<std::string, std::vector<std::string>>
Using this solution means that i need to look for a key only once but then i have to iterate on the array to find the correct string that i need.
The second solution i thought was to use each string contained in the vectors (i know that they are unique) as key and map them to what would've been the key in solution 1. std::unordered_map<std::string, std::string>
Using this solution means that i need to look for a key n times (where n is the length of the array in solution 1) and in my map i have the same value for many keys (i don't know if that matters in the end) but i would directly have the string that i need.
example 1:
std::unordered_map<std::string, std::vector<std::string>> map;
std::vector<std::string> arr = {"hello", "world"};
map["greetings"] = arr;
example 2:
std::unordered_map<std::string, std::string> map;
map["hello"] = "greetings";
map["world"] = "greetings";
For the purpose of my program it doesn't matter what string I have in the end (the value from the array of solution 1 or the value from solution 2) as long as I have a way to map them to each other so both solutions are viable.
I don't have a way to know in advance the length of the array in solution 1.
Are there any major differences in the two solutions? Which one would be faster/use less memory on paper?

You have a mapping between one string and a sequence of strings (or perhaps a set of strings, if the insertion order isn't significant). Let us call the former keys and the latter values, despite your second example using them in reverse manner.
Example one allows you to efficiently find all values associated with a particular key. For that lookup, approach one is faster and approach two is slower.
Example two allows you to efficiently find the key to which a particular value is mapped. For that lookup, approach two is faster and approach one is slower.
As you can see, each example is faster than the other at the lookup it is built for, so the choice depends on which direction you need to query.

Your two options do different things.
example 1:
std::unordered_map<std::string, std::vector<std::string>> map;
map["greetings"] = {"hello", "world"};
map["farewells"] = {"goodbye", "cruel", "world"};
for(auto && pair : map) {
for(auto && value : pair.second) {
std::cout << pair.first << ' ' << value << '\n';
}
}
// greetings hello
// greetings world
// farewells goodbye
// farewells cruel
// farewells world
example 2:
std::unordered_map<std::string, std::string> map;
map["hello"] = "greetings";
map["world"] = "greetings";
map["goodbye"] = "farewells";
map["cruel"] = "farewells";
map["world"] = "farewells"; // overwrites the earlier "greetings" entry for "world"
for(auto && pair : map) {
std::cout << pair.second << ' ' << pair.first << '\n';
}
// greetings hello
// farewells goodbye
// farewells cruel
// farewells world
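If both lookup directions are needed, one option is to build the second map from the first. The sketch below assumes each value appears under only one key; the typedef names Forward and Reverse and the function invert are hypothetical, chosen for illustration.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

typedef std::unordered_map<std::string, std::vector<std::string>> Forward;
typedef std::unordered_map<std::string, std::string> Reverse;

// Build the value -> key map (example 2) from the key -> values map (example 1).
Reverse invert(const Forward& forward) {
    Reverse reverse;
    for (const auto& entry : forward) {
        for (const auto& value : entry.second) {
            // If a value occurs under several keys, a later assignment
            // replaces the earlier one, as with "world" above.
            reverse[value] = entry.first;
        }
    }
    return reverse;
}
```

Because the iteration order of an unordered_map is unspecified, which key wins for a repeated value is also unspecified; keep the forward map if that information matters.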

Strings in Vectors. and placing them in order

So I am placing objects in a vector. I want to insert them in sorted order as they are added. The basics of the object are:
class myObj {
private:
string firstName;
string lastName;
public:
string getFirst;
string getLast;
}
I also have a vector of these objects
vector< myObj > myVect;
vector< myObj >::iterator myVectit = myVect.begin();
When I add a new object to the vector, I want to find where it should be placed before inserting it. Can I search a vector by an object value, and how? This is my first attempt:
void addanObj (myObj & objtoAdd){
int lowwerB = lower_bound(
myVect.begin().getLast(), myVect.end().getLast(), objtoAdd.getLast()
);
int upperB = upper_bound(
myVect.begin().getLast(), myVect.end().getLast(), objtoAdd.getLast()
);
From there I plan to use lowwerB and upperB to determine where to insert the entry. What do I need to do to get this to work, or what is a better method of tackling this challenge?
----Follow up----
the error I get when I attempt to compile
error C2440: 'initializing' : cannot convert from 'std::string' to 'int'
No user-defined-conversion operator available that can perform this conversion,
or the operator cannot be called
The compiler highlights both lower_bound and upper_bound. I would guess it is referring to where I am putting
objtoAdd.getLast()
-----More Follow up-----------------
This is close to compiling, but not quite. What should I expect to get from lower_bound and upper_bound? It doesn't match the iterator I defined, and I'm not sure what I should expect.
void addMyObj(myObj myObjtoadd)
vector< myObj>::iterator tempLB;
vector< myObj>::iterator tempUB;
myVectit= theDex.begin();
tempLB = lower_bound(
myVect.begin()->getLast(), myVect.end()->getLast(), myObjtoadd.getLast()
);
tempUB = upper_bound(
myVect.begin()->getLast(), myVect.end()->getLast(), myObjtoadd.getLast()
);

Your calls to std::lower_bound and std::upper_bound are incorrect. The first two parameters must be iterators that define a range of elements to search and the returned values are also iterators.
Since these algorithms compare the container elements to the third parameter value you'll also need to provide correct operator< functions that compare an object's lastName and a std::string. I've added two different compare functions since std::lower_bound and std::upper_bound pass the parameters in opposite order.
I think I have the machinery correct in this code, it should be close enough for you to get the idea.
#include <algorithm> // std::lower_bound, std::upper_bound
#include <string>
#include <vector>
class myObj {
private:
std::string firstName;
std::string lastName;
public:
std::string getFirst() const { return firstName; }
std::string getLast() const { return lastName; }
};
bool operator<(const myObj &obj, const std::string &value) // used by lower_bound()
{
return obj.getLast() < value;
}
bool operator<(const std::string &value, const myObj &obj) // used by upper_bound()
{
return value < obj.getLast();
}
int main()
{
std::vector<myObj> myVect;
std::vector<myObj>::iterator tempLB, tempUB;
myObj objtoAdd;
tempLB = std::lower_bound(myVect.begin(), myVect.end(), objtoAdd.getLast());
tempUB = std::upper_bound(myVect.begin(), myVect.end(), objtoAdd.getLast());
}
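To finish the job, the iterator returned by the search can be passed straight to vector::insert. This is a sketch under assumptions: Person is a stand-in for myObj with public fields, insertSorted is a hypothetical name, and a lambda comparator (C++11) is used in place of the two operator< overloads above.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct Person {  // stand-in for myObj, with public fields for brevity
    std::string firstName;
    std::string lastName;
};

// Insert `p` so that `people` stays sorted by lastName.
// Precondition: `people` is already sorted by lastName.
void insertSorted(std::vector<Person>& people, const Person& p) {
    std::vector<Person>::iterator pos = std::upper_bound(
        people.begin(), people.end(), p,
        [](const Person& a, const Person& b) { return a.lastName < b.lastName; });
    // upper_bound returns the position after any equal last names,
    // so entries with the same last name keep their insertion order.
    people.insert(pos, p);
}
```

Using lower_bound instead would place the new element before existing entries with the same last name; either keeps the vector sorted.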

So this is definitely not the best way to go. Here's why:
Vector Size
A vector stores its elements in one contiguous block that has some spare capacity. Once the size would exceed that capacity (say, when you add the 101st element to a block that holds 100), it has to allocate a larger block, copy or move all the data over, and then free the old memory. This reallocation can become expensive if it happens often.
Inserting into a vector
This is going to be even more of a problem. Because a vector is just a contiguous block of memory with objects stored in insert order, say you have the below:
[xxxxxxxzzzzzzzz ]
If you want to add 'y', it belongs between x and z, right? This means you need to move all the z's over one place. But because you are reusing the same block of memory, you need to do it one at a time.
[xxxxxxxzzzzzzz z ]
[xxxxxxxzzzzzz zz ]
[xxxxxxxzzzzz zzz ]
...
[xxxxxxx zzzzzzzz ]
[xxxxxxxyzzzzzzzz ]
(the spaces are for clarity - previous value isn't explicitly cleared)
As you can see, this is a lot of steps to make room for your 'y', and will be very very slow for large data sets.
A better solution
As others have mentioned, std::set sounds more appropriate for your needs. std::set keeps its elements ordered automatically (it uses a tree data structure, so each insertion takes O(log n) time) and also lets you find an element by last name in O(log n) time. It does this by using bool myObj::operator<(const myObj& other) const to decide how to order the objects. If you define this operator to compare this->lastName < other.lastName, inserting into the set is much quicker than inserting into a sorted vector. Note that with this ordering std::set keeps only one element per distinct last name; use std::multiset if last names can repeat.
Alternately, if you really really want to use vector: instead of sorting it as you go, just add all the items to the vector, and then perform std::sort to sort them after all the inserts are done. This will also complete in n log(n) time, but should be considerably faster than the current approach because of the vector insertion problem.
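A minimal sketch of the tree-based approach. It swaps in std::multiset for std::set, because std::set would treat two people with the same last name as duplicates, and it uses a separate comparator type instead of a member operator<. The names Person, ByLastName, Directory and countByLastName are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

struct Person {
    std::string firstName;
    std::string lastName;
};

// Ordering used by the container: compare by last name only.
struct ByLastName {
    bool operator()(const Person& a, const Person& b) const {
        return a.lastName < b.lastName;
    }
};

typedef std::multiset<Person, ByLastName> Directory;

// Count the entries with the given last name using the container's
// ordered search instead of a linear scan.
std::size_t countByLastName(const Directory& dir, const std::string& last) {
    Person probe;
    probe.lastName = last;  // only lastName takes part in the comparison
    return dir.count(probe);
}
```

Iterating from begin() to end() then visits the entries in last-name order without any explicit sorting step.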

Fast Retrieval of a Specific Object from a Collection of Pointers

I am trying to come up with techniques for accessing/retrieving an object from a container (map, vector, etc.) in the most efficient manner possible.
So if I have the object:
class Person
{
public:
string name;
unsigned int ID; // unique ID
double deposit;
};
// And then I have a vector of pointers to person objects
std::vector <Person*> people;
Person* getPerson( string nName );
Person* getPerson( unsigned int nID ); // what would be a good method to quickly identify/retrieve the correct Person object from my vector?
My ideas:
This is the iterative solution that is not efficient:
Person* getPerson( string nName )
{
for (int i=0; i<people.size(); i++)
{
if (people[i]->name == nName ) { return people[i]; }
}
}
Another way: have 2 maps
map <string, Person*> personNameMap;
Person* getPerson( string nName )
{
return personNameMap[nName];
}
map <string, Person*> personIDMap;
Person* getPerson( unsigned int nID )
{
char id[2];
atoi( nID, id, 10 ); // or is it itoa?
return personNameMap[id];
}
Any other ideas on how I could store and retrieve my objects from a collection in a fast and efficient manner?

std::map stores its elements in a balanced tree structure and provides quite good lookup speed. But inserting into a std::map is slower than appending to a sequence container, for the same reason: the tree has to be kept ordered. So a map is a good choice if you have a lot of lookups and a fairly small number of insertions.
Besides that, I don't understand exactly why you made map <string, Person*> personIDMap; instead of map <unsigned int, Person*> personIDMap.
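A sketch of the two-map idea with an unsigned int key for the ID map, so no number-to-string conversion is needed. PersonIndex and its member names are hypothetical, and the Person objects are assumed to be owned elsewhere and to outlive the index.

```cpp
#include <cassert>
#include <map>
#include <string>

struct Person {
    std::string name;
    unsigned int id;  // unique ID
    double deposit;
};

// Hypothetical registry: two indexes over Person objects owned elsewhere.
class PersonIndex {
public:
    void add(Person* p) {
        byName_[p->name] = p;
        byId_[p->id] = p;
    }

    // find() is used instead of operator[] so that a failed lookup
    // does not insert a null entry into the map.
    Person* byName(const std::string& name) const {
        std::map<std::string, Person*>::const_iterator it = byName_.find(name);
        return it == byName_.end() ? nullptr : it->second;
    }
    Person* byId(unsigned int id) const {
        std::map<unsigned int, Person*>::const_iterator it = byId_.find(id);
        return it == byId_.end() ? nullptr : it->second;
    }

private:
    std::map<std::string, Person*> byName_;
    std::map<unsigned int, Person*> byId_;
};
```

Both lookups return nullptr for an unknown key, which the caller has to check before dereferencing.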

std::map is a balanced tree, so a search takes O(log n) steps. Boost offers boost::unordered_map, which is a hash map. Its worst case is asymptotically worse (O(n) per lookup when many keys collide), but on average a lookup takes constant time, typically one to three steps depending on how full the table is. As the table fills up there are more collisions and performance degrades; implementations limit this by rehashing into a larger table once the load factor passes a threshold. This is not a problem in most cases, but be aware of the limitation.

std::map is a fast container for lookups in C++ when the key is not an integer.
I hope that helps.

std::map keys in C++

I have a requirement to create two different maps in C++. The key is of type CHAR* and the value is a pointer to a struct. I am filling the two maps with these pairs in separate iterations. After creating both maps, I need to find all instances in which the strings referenced by the CHAR* keys are the same.
For this I am using the following code :
typedef struct _STRUCTTYPE
{
..
} STRUCTTYPE, *PSTRUCTTYPE;
typedef pair <CHAR *,PSTRUCTTYPE> kvpair;
..
CHAR *xyz;
PSTRUCTTYPE abc;
// after filling the information;
Map.insert (kvpair(xyz,abc));
// the above is repeated x times for the first map, and y times for the second map.
// after both are filled out;
std::map<CHAR *, PSTRUCTTYPE>::iterator Iter,findIter;
for (Iter=iteratedMap->begin();Iter!=iteratedMap->end();++Iter)
{
char *key = Iter->first;
printf("%s\n",key);
findIter=otherMap->find(key);
//printf("%u",findIter->second);
if (findIter!=otherMap->end())
{
printf("Match!\n");
}
}
The above code does not show any match, although the list of keys in both maps show obvious matches. My understanding is that the equals operator for CHAR * just equates the memory address of the pointers.
My question is: what should I do to alter the equals operator for this type of key, or could I use a different datatype for the string?

My understanding is that the equals operator for CHAR* just equates the memory address of the pointers.
Your understanding is correct.
The easiest thing to do would be to use std::string as the key. That way you get comparisons for the actual string value working without much effort:
std::map<std::string, PSTRUCTTYPE> m;
PSTRUCTTYPE s = bar();
m.insert(std::make_pair("foo", s));
if(m.find("foo") != m.end()) {
// works now
}
Note that you might leak memory for your structs if you don't always delete them manually. If you can't store by value, consider using smart pointers instead.
Depending on your use case, you don't necessarily have to store pointers to the structs:
std::map<std::string, STRUCTTYPE> m;
m.insert(std::make_pair("foo", STRUCTTYPE(whatever)));
A final note: typedefing structs the way you are doing it is a C-ism, in C++ the following is sufficient:
typedef struct STRUCTTYPE {
// ...
} *PSTRUCTTYPE;
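If the keys have to stay as C strings, std::map also accepts a custom ordering as its third template argument. The following is a sketch of that alternative, not the approach recommended above; CStringLess, CStringMap and containsKey are hypothetical names, and an int value stands in for the struct pointer.

```cpp
#include <cassert>
#include <cstring>
#include <map>

// Strict weak ordering on the characters, not on the pointer values.
struct CStringLess {
    bool operator()(const char* a, const char* b) const {
        return std::strcmp(a, b) < 0;
    }
};

// The map does not copy or own the character data; every key must
// outlive the map entry that refers to it.
typedef std::map<const char*, int, CStringLess> CStringMap;

// Return true when `key` (compared by content) is present in `m`.
bool containsKey(const CStringMap& m, const char* key) {
    return m.find(key) != m.end();
}
```

std::map never uses operator== on its keys; two keys are treated as equal when neither orders before the other, which is why supplying only the less-than comparison is enough.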

If you use std::string instead of char *, there are more convenient comparison functions you can use. Also, instead of writing your own key-matching code, you can use the standard std::set_intersection algorithm to find the shared elements of two sorted ranges (std::map is, of course, sorted by key). Here is an example:
typedef std::map<std::string, STRUCTTYPE *> ExampleMap;
ExampleMap inputMap1, inputMap2, matchedMap;
// Insert elements to input maps
inputMap1.insert(...);
// Put the elements of inputMap1 whose keys also appear in inputMap2 into matchedMap.
// value_comp() compares the keys only; std::inserter (from <iterator>) is needed because matchedMap starts out empty.
std::set_intersection(inputMap1.begin(), inputMap1.end(), inputMap2.begin(), inputMap2.end(), std::inserter(matchedMap, matchedMap.end()), inputMap1.value_comp());
for(ExampleMap::iterator iter = matchedMap.begin(); iter != matchedMap.end(); ++iter)
{
// Do things with matched elements
std::cout << iter->first << std::endl;
}

How can I change the value in a pair in maps

I can do:
map<char*, int> counter;
++counter["apple"];
But when I do:
--counter["apple"] // when counter["apple"] ==2;
I got debugger hung up in VS 2008.
Any hints?

Do you rely on the pointer value of the key? A string literal is not required to have the same address in different uses of it (especially when used in different translation units). So you may actually create two entries with this:
counter["apple"] = 1;
counter["apple"] = 1;
Also, you get no meaningful ordering, since the map sorts by address. Use std::string, which does not have that problem: it is aware of the content, and its operator< compares lexicographically:
map<std::string, int> counter;
counter["apple"] = 1;
assert(++counter["apple"] == 2);

A map of the form:
map <char *, int> counter;
is not a very sensible structure, because it cannot manage the char pointers it contains effectively. Change the map to:
map <string, int> counter;
and see if that cures the problem.

I found the problem.
If I change it to:
map<string,int> counter;
counter["apple"]++;
if(counter["apple"]==1)
counter.erase("apple");
else
counter["apple"]--; //this will work
In the key/value pair, if the value is an int and equals 1, I somehow could not do map[key]-- (perhaps because that would make the value 0?).
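The erase-or-decrement logic above can be wrapped in one function that looks the key up only once. This is a sketch assuming std::string keys; Counter and decrement are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef std::map<std::string, int> Counter;

// Decrement the count for `key` and remove the entry once it reaches zero.
// Unknown keys are left alone rather than being created by operator[].
void decrement(Counter& counter, const std::string& key) {
    Counter::iterator it = counter.find(key);
    if (it == counter.end()) return;
    if (--it->second <= 0) counter.erase(it);
}
```

With std::string keys, decrementing a stored int to zero is well defined; erasing the entry at zero is a design choice that keeps the map free of empty counts.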
