Cache locality with unique_ptr - c++

I have a vector of custom classes (std::string just for example).
The vector is large and I iterate through often, so I rely on cache locality.
I also have one raw pointer which points at one of the vector elements.
Now is the trick:
The vector is sorted from time to time, so the raw pointer no longer refers to the value it originally pointed at; it ends up pointing at whatever value now occupies that slot.
Here is an example to illustrate the same:
#include <iostream>
#include <algorithm>
#include <string>
#include <vector>
#include <memory>
using namespace std;
int main()
{
vector<string> v = {"9","3", "8", "7", "6", "5", "1", "4", "2"};
string* rs = &v[7]; // point to the element at index 7 (value "4")
for (size_t i = 0; i < v.size(); ++i)
cerr << v[i];
cerr << endl;
cerr << "Referenced string: " << rs->c_str() << endl;
cerr << "Sort ..." << endl;
sort(v.begin(), v.end(), [](const string& a, const string& b)
{
if (a < b)
return true;
else
return false;
}
);
for (size_t i = 0; i < v.size(); ++i)
cerr << v[i];
cerr << endl;
cerr << "Referenced string: " << rs->c_str() << endl;
cin.get();
return 0;
}
Output:
938765142
Referenced string: 4
Sort ...
123456789
Referenced string: 8
Since I want the rs pointer to keep pointing to the value that was at index 7 (which is 4) even after the sort, I came up with the following solution (a vector of pointers):
#include <iostream>
#include <algorithm>
#include <string>
#include <vector>
#include <memory>
using namespace std;
int main()
{
vector<unique_ptr<string>> v;
v.resize(9);
v[0] = make_unique<string>("9");
v[1] = make_unique<string>("3");
v[2] = make_unique<string>("8");
v[3] = make_unique<string>("7");
v[4] = make_unique<string>("6");
v[5] = make_unique<string>("5");
v[6] = make_unique<string>("1");
v[7] = make_unique<string>("4");
v[8] = make_unique<string>("2");
string* rs = v[7].get();
for (size_t i = 0; i < v.size(); ++i)
cerr << v[i]->c_str();
cerr << endl;
cerr << "Referenced string before sort: " << rs->c_str() << endl;
cerr << "Sort ..." << endl;
sort(v.begin(), v.end(), [](const unique_ptr<string>& a, const unique_ptr<string>& b)
{
if (*a < *b)
return true;
else
return false;
}
);
for (size_t i = 0; i < v.size(); ++i)
cerr << v[i]->c_str();
cerr << endl;
cerr << "Referenced string after sort: " << rs->c_str() << endl;
cin.get();
return 0;
}
Output:
938765142
Referenced string before sort: 4
Sort ...
123456789
Referenced string after sort: 4
While this latter solution works, there is a price: I have lost the cache locality of my vector, since I store pointers in it, rather than the actual objects.
Is there a way to maintain cache locality (i.e. store the actual objects in the vector) and still have the rs pointer keep track of its value as the sorts move it around?
Or from the other perspective, is there a way to achieve cache locality with the vector of pointers?
Solution from Pubby, thanks!:
#include <iostream>
#include <algorithm>
#include <string>
#include <vector>
#include <memory>
using namespace std;
int main()
{
vector<string> data = { "d","e", "f", "g", "i", "b", "c", "a", "h" };
vector<int> indexes = {0,1,2,3,4,5,6,7,8};
int si = 6;
for (size_t i = 0; i < indexes.size(); ++i)
cerr << indexes[i];
cerr << endl;
for (size_t i = 0; i < indexes.size(); ++i)
cerr << data[indexes[i]];
cerr << endl;
cerr << "Referenced string before sort: " << data[si] << endl;
cerr << "Sort ..." << endl;
sort(indexes.begin(), indexes.end(), [&](const int a, const int b)
{
return data[a] < data[b];
}
);
for (size_t i = 0; i < indexes.size(); ++i)
cerr << indexes[i];
cerr << endl;
for (size_t i = 0; i < indexes.size(); ++i)
cerr << data[indexes[i]];
cerr << endl;
cerr << "Referenced string after sort: " << data[si] << endl;
cin.get();
return 0;
}

You can increase locality by storing the strings in a vector which doesn't change, and then store a vector of pointers/indexes to these strings.
Like this:
vector<string> data = {"9","3", "8", "7", "6", "5", "1", "4", "2"};
vector<unsigned> indexes(data.size());
std::iota(indexes.begin(), indexes.end(), 0u); // std::iota is declared in <numeric>
To sort your data you'd sort indexes using a custom comparator function which retrieves the values from data and compares them. Remember: indexes can change, but data should not!
sort(indexes.begin(), indexes.end(), [&](unsigned a, unsigned b)
{
return data[a] < data[b];
});
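To illustrate how the referenced element can still be located after sorting, here is a minimal sketch under the assumptions of this answer: `data` is never reordered, so a pointer or index into it stays valid, and the element's current rank in sorted order can be recovered by searching `indexes`. The helper names `sorted_order` and `rank_of` are hypothetical, chosen only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Returns indexes into `data`, arranged so that data[result[0]] <= data[result[1]] <= ...
// `data` itself is left untouched, so pointers and indexes into it stay valid.
std::vector<std::size_t> sorted_order(const std::vector<std::string>& data)
{
    std::vector<std::size_t> indexes(data.size());
    std::iota(indexes.begin(), indexes.end(), std::size_t{0});
    std::sort(indexes.begin(), indexes.end(),
              [&data](std::size_t a, std::size_t b) { return data[a] < data[b]; });
    return indexes;
}

// Returns the position that data[tracked] occupies in the sorted order.
// This is a linear search; `tracked` must be one of the values stored in `indexes`.
std::size_t rank_of(const std::vector<std::size_t>& indexes, std::size_t tracked)
{
    auto it = std::find(indexes.begin(), indexes.end(), tracked);
    assert(it != indexes.end());
    return static_cast<std::size_t>(it - indexes.begin());
}
```

With the data from the question, a pointer taken as `&data[7]` still refers to "4" after calling `sorted_order(data)`, and `rank_of(indexes, 7)` tells you where that element sits in sorted order. Note that iterating in sorted order now goes through `indexes`, so those accesses into `data` are no longer sequential.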

Just an idea: Instead of storing std::string in the vector, just append the character arrays of each string to a std::vector<char>.
This packs the strings closely together in memory, improving locality even more than std::string with small string optimization does. It will also give better results if the strings exceed the maximum size for small string optimization.
For sorting, store the index and size of each string in a second vector, similar to Pubby's suggestion.
Of course this only works if the string length doesn't need to change dynamically. Otherwise you would have to rebuild the vector<char>.
#include <iostream>
#include <algorithm>
#include <vector>
#include <utility>
#include <string_view>
using namespace std;
using IndexAndSize = pair<size_t,size_t>;
void push_and_index( vector<char>& v, vector<IndexAndSize>& vi, string_view s )
{
vi.emplace_back( v.size(), s.size() );
v.insert( end(v), begin(s), end(s) );
}
string_view make_string_view( vector<char> const& v, IndexAndSize is )
{
return { v.data() + is.first, is.second };
}
int main()
{
vector<char> v;
vector<IndexAndSize> vi;
push_and_index( v, vi, "foo" );
push_and_index( v, vi, "bar" );
push_and_index( v, vi, "foobar" );
push_and_index( v, vi, "barfoo" );
sort( begin(vi), end(vi), [&]( IndexAndSize a, IndexAndSize b )
{
return make_string_view( v, a ) < make_string_view( v, b );
});
for( IndexAndSize is : vi )
{
cout << make_string_view( v, is ) << endl;
}
}
Live demo on Coliru.
Note: C++17's string_view is used only to help with the sorting and output, it's not crucial for this idea.

Related

c++ stack overflow due to recursive function how can I improve the data handling

I'm tackling an exercise that is designed to cause exactly this problem: exhausting memory. I'm loading files of various sizes, from 1,000 to 5 million lines, with entries like this in a txt file (1 line = 1 entry):
SHFIv,aiSdG
PlgNB,bPHoP
ZHWJU,gfwgC
UAygL,Vqvhi
BlyzX,LLbCo
jbvrT,Utblj
...
Every entry has two values separated by a comma. In my code I split these values and try to find another entry with a matching value. There are always exactly two matching values: each time one value is found, the value it is paired with points to another pair, and so on until the final one is found.
For example SHFIv,aiSdG would point to aiSdG,YDUVo.
I know my code is not very efficient, partly due to the recursion, but I couldn't figure out a better way to do the job, so any suggestions on how to improve it to handle larger inputs would be greatly appreciated.
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <map>
#include <unordered_map>
#include <stdio.h>
#include <vector>
#include <iterator>
#include <utility>
#include <functional>
#include <algorithm>
using namespace std;
template<typename T>
void search_bricks_backwards(string resume, vector<T>& vec, vector<string>& vec2) {
int index = 0;
for (const auto& pair : vec) {
//cout << "iteration " << index << endl;
if (pair.second == resume) {
vec2.insert(vec2.begin(), resume);
cout << "found " << resume << " and " << pair.second << endl;
search_bricks_backwards(pair.first, vec, vec2);
}
if (index + 1 == vec.size()) {
cout << "end of backward search, exiting..." << endl;
}
index++;
}
}
template<typename T>
void search_bricks(string start, vector<T>& vec, vector<string>& vec2) {
int index = 0;
for (const auto& pair : vec) {
//cout << "iteration " << index << endl;
if (pair.first == start) {
vec2.push_back(start);
cout << "found " << start << " and " << pair.first << endl;
search_bricks(pair.second, vec, vec2);
}
if (index + 1 == vec.size()) {
//search_bricks_backwards(start, vec, vec2);
// this also gets called on every recursion rather than just once
// as I originally intended when the forward iteration gets finished
}
index++;
}
}
template<typename T> // printing function
void printVectorElements(vector<T>& vec)
{
for (auto i = 0; i < vec.size(); ++i) {
cout << "(" << vec.at(i).first << ","
<< vec.at(i).second << ")" << endl ;
}
cout << endl;
}
vector<string> split(string s, string delimiter) { // filtering function
size_t pos_start = 0, pos_end, delim_len = delimiter.length();
string token;
vector<string> res;
while ((pos_end = s.find(delimiter, pos_start)) != string::npos) {
token = s.substr(pos_start, pos_end - pos_start);
pos_start = pos_end + delim_len;
res.push_back(token);
}
res.push_back(s.substr(pos_start));
return res;
}
int main()
{
vector<pair<string, string>> bricks;
vector<string> sorted_bricks;
ifstream inFile;
inFile.open("input-pairs-5K.txt"); // transferring data from .txt to a string
stringstream strStream;
strStream << inFile.rdbuf();
string str = strStream.str();
istringstream iss(str);
for (string line; getline(iss, line); )
// filtering data from string and dividing on ","
{
string delimiter = ",";
string s = line;
vector<string> v = split(s, delimiter);
string s1 = v.at(0);
string s2 = v.at(1);
bricks.push_back(make_pair(s1, s2));
}
search_bricks(bricks[0].second, bricks, sorted_bricks);
//printVectorElements(bricks);
//for (auto i = sorted_bricks.begin(); i != sorted_bricks.end(); ++i)
//cout << *i << " "; // this is just to check if vectors have data
}
Here is a link to the 1k test data that works for me (only for search_bricks, without the backward search, since that triggers on every recursion). Again, thanks for any suggestions on how to improve or get rid of the recursion. I don't code in C++ often and don't really know how else to tackle this.
Although implementing a non-recursive version of your algorithm is the canonical solution, if you really need to solve the problem without modifying the code, you can increase the stack size through a linker option. About 100 MB (104857600 bytes) is usually sufficient.
In MSVC (linker option): /STACK:104857600
In gcc (MinGW, passed through to the linker): -Wl,--stack,104857600
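For comparison, here is a hedged sketch of what a non-recursive version might look like. It is not the asker's code: it assumes each label appears at most once as a first value and at most once as a second value, and it replaces the repeated linear scans with two hash maps. The names `Brick` and `order_bricks` are hypothetical.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Brick = std::pair<std::string, std::string>;

// Orders the chain of labels that passes through bricks[0], using loops
// instead of recursion, so the call stack does not grow with the input size.
std::vector<std::string> order_bricks(const std::vector<Brick>& bricks)
{
    if (bricks.empty()) return {};

    std::unordered_map<std::string, std::string> next;  // first  -> second
    std::unordered_map<std::string, std::string> prev;  // second -> first
    next.reserve(bricks.size());
    prev.reserve(bricks.size());
    for (const Brick& b : bricks) {
        next.emplace(b.first, b.second);
        prev.emplace(b.second, b.first);
    }

    std::deque<std::string> chain{bricks[0].first, bricks[0].second};

    // Walk forward from the end of the chain. The step limit keeps the
    // loop finite even if the data unexpectedly contains a cycle.
    for (std::size_t steps = 0; steps < bricks.size(); ++steps) {
        auto it = next.find(chain.back());
        if (it == next.end()) break;
        chain.push_back(it->second);
    }
    // Walk backward from the front of the chain, with the same limit.
    for (std::size_t steps = 0; steps < bricks.size(); ++steps) {
        auto it = prev.find(chain.front());
        if (it == prev.end()) break;
        chain.push_front(it->second);
    }
    return std::vector<std::string>(chain.begin(), chain.end());
}
```

Each lookup is expected constant time, so the whole chain is built in roughly linear time, whereas the recursive version rescans the vector for every link.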

How to print certain elements from vector?

I am trying to print only what is necessary from my program. It takes a long list from a text file, sorts it by first choice and GPA, and puts it into a vector. I managed to sort by first choice and GPA, but how can I remove the output that isn't needed?
This is an example of my Txt File (The sequence of each line is 1st choice, 2nd choice, 3rd choice, GPA, Name):
CC,DR,TP,3.8,AlexKong
SN,SM,TP,4,MarcusTan
DR,TP,SC,3.6,AstaGoodwin
SC,TP,DR,2.8,MalcumYeo
SN,SM,TP,3.7,DavidLim
SN,SM,TP,3.2,SebastianHo
SC,TP,DR,4,PranjitSingh
DR,TP,SC,3.7,JacobMa
and so on...
This is my output now (it is a long vector):
TP,DR,SC,4,SitiZakariah
TP,DR,SC,3.9,MuttuSami
TP,DR,SC,3.5,SabrinaEster
TP,DR,SC,3,KarimIlham
TP,DR,SC,3,AndryHritik
SN,SM,TP,4,MarcusTan
SN,SM,TP,3.8,MarcusOng
SN,SM,TP,3.7,DavidLim
SN,SM,TP,3.4,MollyLau
SN,SM,TP,3.2,SebastianHo
SN,SM,TP,3.2,NurAfiqah
SN,SM,TP,2.4,TanXiWei
SC,TP,DR,4,SallyYeo
SC,TP,DR,4,PranjitSingh
SC,TP,DR,3.6,RanjitSing
SC,TP,DR,2.8,MalcumYeo
SC,TP,DR,2.8,AbdulHalim
SC,TP,DR,2.7,AlifAziz
DR,TP,SC,3.9,SitiAliyah
DR,TP,SC,3.9,LindaChan
DR,TP,SC,3.8,SohLeeHoon
DR,TP,SC,3.7,PrithikaSari
DR,TP,SC,3.7,NurAzizah
DR,TP,SC,3.7,JacobMa
DR,TP,SC,3.6,AstaGoodwin
CC,DR,TP,3.9,MuruArun
CC,DR,TP,3.8,AlexKong
CC,DR,TP,3.7,DamianKoh
CC,DR,TP,3.3,MattWiliiams
CC,DR,TP,3.3,IrfanMuhaimin
And this is the output that I need (Basically students with CC as their 1st choice without displaying the 3 options):
3.9,MuruArun
3.8,AlexKong
3.7,DamianKoh
3.3,MattWiliiams
3.3,IrfanMuhaimin
This is my program.
#include <iostream>
#include <vector>
#include <fstream>
#include <string>
#include <algorithm>
using namespace std;
struct greater
{
template<class T>
bool operator()(T const &a, T const &b) const { return a > b; }
};
int main()
{
vector<string> v;
int p = 0;
ifstream File;
File.open("DSA.txt");
if (!File.is_open()) return 1;
string First;
cout << "Round 1:\n";
while (File >> First)
{
v.push_back(First);
p++;
}
for (int i = 0; i < v.size(); i++)
{
sort(v.begin(), v.end(), greater());
cout << v[i] << endl;
}
}
Change your last for loop so that it skips the first 9 characters of each line (the three choices and their commas):
for (int i = 0; i < v.size(); i++)
{
sort(v.begin(), v.end(), greater());
cout << v[i].substr(9) << endl;
}
EDIT:
If you want to display only the ones with CC as 1st choice, you can add an if statement to your loop:
for (int i = 0; i < v.size(); i++)
{
if (v[i].substr(0,2) != "CC") continue;
cout << v[i].substr(9) << endl;
}
Also, I noticed another problem in your code. You should not sort the vector at every iteration. You should do it only once before the loop:
sort(v.begin(), v.end(), greater());
for (int i = 0; i < v.size(); i++)
{
if (v[i].substr(0,2) != "CC") continue;
cout << v[i].substr(9) << endl;
}
As I proposed in the comments: since each row has a well-defined structure, you can interpret each row semantically and filter on that. Here is what I am talking about:
#include <iostream>
#include <string>
#include <vector>
int main()
{
std::vector<std::string> v;
std::string r = "CC,DR,TP,3.9,MuruArun";
std::string delimiter = ",";
std::string token = r.substr(0, r.find(delimiter)); // first field, here "CC"
if (token == "CC") // compare to whatever first choice you want to keep
{
v.emplace_back(r);
}
std::cout << "token: " << token << std::endl;
std::cout << v.size() << std::endl;
return 0;
}
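The idea of interpreting each row as a structure can be taken one step further. The following is only a sketch, assuming every valid line has exactly the five comma-separated fields shown in the question; the names `Applicant`, `parse_applicant` and `with_first_choice` are hypothetical.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

struct Applicant {
    std::string choice1, choice2, choice3;
    double gpa = 0.0;
    std::string name;
};

// Parses one line of the form "CC,DR,TP,3.9,MuruArun".
// Returns false if a field is missing or the GPA is not a number.
bool parse_applicant(const std::string& line, Applicant& out)
{
    std::istringstream in(line);
    std::string gpa_text;
    if (!std::getline(in, out.choice1, ',')) return false;
    if (!std::getline(in, out.choice2, ',')) return false;
    if (!std::getline(in, out.choice3, ',')) return false;
    if (!std::getline(in, gpa_text, ',')) return false;
    if (!std::getline(in, out.name)) return false;
    std::istringstream gpa_in(gpa_text);
    return static_cast<bool>(gpa_in >> out.gpa);
}

// Keeps only the applicants whose first choice equals `choice`,
// ordered by GPA from highest to lowest.
std::vector<Applicant> with_first_choice(const std::vector<std::string>& lines,
                                         const std::string& choice)
{
    std::vector<Applicant> result;
    for (const std::string& line : lines) {
        Applicant a;
        if (parse_applicant(line, a) && a.choice1 == choice)
            result.push_back(a);
    }
    // stable_sort keeps applicants with equal GPA in their input order.
    std::stable_sort(result.begin(), result.end(),
                     [](const Applicant& x, const Applicant& y) { return x.gpa > y.gpa; });
    return result;
}
```

Printing `a.gpa << ',' << a.name` for each element of `with_first_choice(v, "CC")` would then give output in the requested shape. Unlike the string sort in the question, this compares GPAs numerically, so the order among applicants with the same GPA may differ from the expected listing.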

How to store different characters' positions using vector or map

I have a string like "aabcdba" and I want to store the positions of each distinct character. I am trying to do this with a vector and an unordered_map. Is there a good approach for storing the positions of the different characters?
void topKFrequent(string s) {
vector<vector<int> >v(123);
//unordered_map<char, vector<int>>m;
for(int i=0;i<s.size();i++) {
v[s[i]].push_back(i);
// m[s[i]].push_back(i);
}
for(int i=0;i<123;i++) {
for(int j=0;j<v[i].size();j++) {
char ch=i;
cout<<ch<<"->"<<v[i][j]<<endl;
}
}
}
if string = "aabcdba", I want the following result:
a->0,1,6;
b->2,5;
c->3;
d->4;
You could use a map<char, vector<unsigned int> >.
#include <iostream>
#include <map>
#include <string>
#include <vector>
using namespace std;
map<char, vector<unsigned int> > storePos(string s)
{
map<char, vector<unsigned int> > charPos;
for(int i=0;i<s.size();i++)
{
auto itr = charPos.find(s[i]);
if(itr != charPos.end())
{
itr->second.push_back(i);
}
else
{
charPos[s[i]] = vector<unsigned int>(1, i);
}
}
return charPos;
}
int main(void)
{
string example = "aabcdba";
auto result = storePos(example);
for(auto itr1 = result.begin(); itr1 != result.end(); itr1 ++)
{
cout << "Letter: " << itr1->first << ", Locations: ";
for(auto itr2 = itr1->second.begin(); itr2 != itr1->second.end();
itr2 ++)
{
cout << *itr2 << " ";
}
cout << endl;
}
}
If you really want to store ordinal positions in the original string sequence, you can do so with either an unordered or ordered map of char to vector, where char is the key, and the vector contains the positions. Using an unordered map will not give you the lexicographical ordering of keys you seem to be seeking, but will nonetheless give you accurate positional vectors.
#include <iostream>
#include <string>
#include <vector>
#include <unordered_map>
int main()
{
std::string s = "aabcdba";
std::unordered_map<char, std::vector<unsigned int>> mymap;
for (unsigned i=0; i<s.size(); ++i)
mymap[s[i]].push_back(i);
for (auto const& pr : mymap)
{
std::cout << pr.first << "->";
auto it = pr.second.cbegin();
std::cout << *it;
while (++it != pr.second.cend())
std::cout << ',' << *it;
std::cout << ";\n";
}
}
Output
d->4;
c->3;
b->2,5;
a->0,1,6;
If you want lexicographical ordering, the simplest alternative is to use a regular ordered map instead. Changing only this:
std::unordered_map<char, std::vector<unsigned int>> mymap;
to this:
std::map<char, std::vector<unsigned int>> mymap;
and including the appropriate header delivers us this for output:
a->0,1,6;
b->2,5;
c->3;
d->4;
which fits exactly what you seem to be looking for.
A possible implementation to store the positions could be using unordered_multimap: (where the key characters can be repeated).
void storePos(string s) {
unordered_multimap<char, int>m;
for(int i=0;i<s.size();i++) {
m.insert(make_pair(s[i],i));
}
}
[EDITED]
But the output may depend on how you use it, or print out the data.
For example, consider using a std::multimap instead of std::unordered_multimap. To populate it you just do:
multimap<char, int>m;
void storePos(string s) {
for(int i=0;i<s.size();i++) {
m.insert(make_pair(s[i],i));
}
}
And to print the data you could have the following method:
void printPos()
{
std::multimap<char,int>::iterator it,itup;
for (it = m.begin(); it != m.end(); )
{
cout << (*it).first << " -> ";
itup = m.upper_bound ((*it).first );
// print range [it,itup):
for (; it != itup; ++it)
{
cout << (*it).second << ", ";
}
cout << endl;
}
}
Output:
a -> 0, 1, 6,
b -> 2, 5,
c -> 3,
d -> 4,
Try this!

C++ array of pointers pointing to another arrays value

I would like to make an array of words like "Tom", "Mike", "Tamara", "Nik"... I would like the user to be able to enter, for instance, the number 3 and get back a random word of length 3, so either "Tom" or "Nik". I think this is done with pointers, but I don't know how. Words should be stored in different arrays depending on their length, and the pointers would point to each array ("Tom" and "Nik" in the same array, "Tamara" in a different array, "Mike" in yet another, and so on, because their lengths are not the same). Can someone please help?
#include<iostream>
#include <string>
using namespace std;
void IzpisPolja(char **polje,int velikost){
int tab[100];
for (int i=0; i<velikost; i++) {
cout<<polje[i]<<endl;
char *zacasni;
tab[i] = strlen(polje[i]);
// cout<<tab[i]<<endl;
}
}
int main(){
const int size = 4;
char* tabelaOseb[size] = {"Tom", "Mike","Tamara","Nik"};
IzpisPolja(tabelaOseb,size);
return 0;
}
Do you want to do it efficiently? Storing them in separate arrays will speed up lookups, but will also increase the complexity of insertion and deletion.
Otherwise you can just count the number of words of length n in the array, then generate a random number i and return the i-th of them.
I also suggest using std::vector.
const string* getRandNameOfLength(const string* arr,
const int arrlen,
const int length)
{
int num = 0, j, i;
// Counting number of such names
for (i = 0; i < arrlen; ++i)
{
if (arr[i].size() == length)
num++;
}
// No such name found
if (num == 0)
return NULL;
j = rand() % num;
// Returning random entry of given length
for (i = 0; i < arrlen; ++i)
{
if (arr[i].size() == length && j-- == 0)
return &arr[i];
}
// Function shouldn't get here
return NULL;
}
You can use raw pointers to perform your task, of course, but you can also start using some of the many safer facilities that the language (references, iterators and smart pointers) and the C++ standard library can offer.
I'll show you a complete program that does what you are asking using containers (std::vector, std::map) and algorithms (like std::lower_bound) that can really simplify your work once understood.
Note that as a learning exercise (for both of us), I have used as many "new" features as I could, even where it may not have been necessary or handy. Read the comments for a better understanding.
The words are stored and managed in a class, while the interaction with the user is performed in main().
#include <iostream>
#include <string>
#include <vector>
#include <map>
#include <limits>
#include <algorithm>
#include <random> // for mt19937, uniform_int_distribution
#include <chrono> // for high_resolution_clock
size_t random_index( size_t a, size_t b ) {
// Initialize Random Number Generator Engine as a static variable. - Since c++11, You can use those instead of old srand(time(NULL))
static std::mt19937 eng{static_cast<long unsigned int>(std::chrono::high_resolution_clock::now().time_since_epoch().count())};
// use the RNG to generate random numbers uniformly distributed in a range
return std::uniform_int_distribution<size_t>(a,b)(eng);
}
using svs_t = std::vector< std::string >; // I store the words with equal length in a std::vector of std::string
// like typedef, I'll use svs_t instead of std::vector<std::string>
auto string_less_then = [] (const std::string & a, const std::string & b) -> bool { return a.compare(b) < 0; };
// A lambda function is a mechanism for specifying a function object, its primary use is to specify a simple
// action to be performed by some function. I'll use it to compare two string and return true only if a<b
class word_table {
std::map< size_t, svs_t > words; // std::map store elements formed by a combination of a key value and a mapped value, sorted by key
// I'll use word's length as a key for svs_t values
public:
word_table() {}
word_table( std::initializer_list<std::string> vs ) {
insert_words(vs);
}
void insert_words( svs_t vs ) {
for ( auto && s : vs ) add_word(s); // loop for each value in vs, "auto" let the compiler infer the right type of the variable
}
bool add_word( std::string s ) { // I choose to keep the vector sorted and with unique elements
size_t sl = s.length();
if ( sl > 0 ) {
auto & v = words[sl]; // If sl doesn't match the key of any element in the map, a new element is created
// lower_bound returns an iterator that points to the first element in the range [begin,end)
auto it = std::lower_bound(v.begin(), v.end(), s, string_less_then); // which does not compare less than s
// I pass the compare function as a lambda
if ( it != v.end() && it->compare(s) == 0 ) return false; // Already present, duplicates not allowed
v.insert(it, s); // Not the most efficient way, but you seem focused on the random access part
return true;
}
return false;
}
bool remove_word( std::string s) {
size_t sl = s.length();
if ( sl > 0 ) {
auto itvw = words.find(sl); // first find the right element in the map, using the string length as a key; if no such key is found,
if ( itvw == words.end() ) return false; // an iterator to the element following the last element of the container is returned
auto & v = itvw->second; // In a map the elements are stored in pairs, first is the key, second the value
auto it = std::lower_bound(v.begin(), v.end(), s, string_less_then);
if ( it != v.end() && it->compare(s) == 0 ) {
v.erase(it);
if ( v.empty() ) words.erase(itvw);
return true;
}
}
return false;
}
std::string get_random_word( size_t length ) {
if ( length == 0 ) return "";
auto itvw = words.find(length);
if ( itvw == words.end() || itvw->second.empty() ) return "";
return itvw->second[random_index(0, itvw->second.size() - 1)];
}
void show_all() {
for ( auto && i : words ) {
std::cout << " ";
for (auto && w : i.second ) {
std::cout << w << ' ';
}
std::cout << '\n';
}
}
};
constexpr size_t ss_max = std::numeric_limits<std::streamsize>::max();
namespace opt {
enum options { wrong = -1, exit, show, random, add, remove, menu };
}
class menu {
std::map<int,std::string> opts;
public:
menu( std::initializer_list<std::pair<int,std::string>> il ) {
for ( auto && i : il ) opts.insert(i);
}
void show() {
std::cout << "\nYou can choose among these options:\n\n";
for ( auto && i : opts ) {
std::cout << " " << i.first << ". " << i.second << ".\n";
}
}
};
int main()
{
word_table names({"Tom", "Mike","Tamara","Robert","Lenny","Nick","Alex","Sue","Irina","Beth","Anastacia","Bo"});
int choise = opt::exit;
menu menu_options { {opt::exit, "Exit program"}, {opt::show, "Show all stored names"},
{opt::random, "Show a random name"}, {opt::add, "Add a new name"},
{opt::remove, "Remove a name"} };
menu_options.show();
do {
std::cout << "\nPlease, enter a number (" << opt::menu << " to show again all options): ";
std::cin >> choise;
if ( std::cin.fail() ) { // the user enter something that is not a number
choise = opt::wrong;
std::cin.clear();
std::cin.ignore(ss_max,'\n');
}
if ( std::cin.eof() ) break; // use only if you are redirecting input from file
std::string str;
switch ( choise ) {
case opt::exit:
std::cout << "\nYou choose to quit, goodbye.\n";
break;
case opt::show:
std::cout << "\nAll the stored names, classified by word\'s length:\n\n";
names.show_all();
break;
case opt::random:
size_t l;
std::cout << "Please, enter the length of the name: ";
std::cin >> l;
if ( std::cin.good() ) {
std::string rs = names.get_random_word(l);
if ( rs == "" ) {
std::cout << "\nNo name of length " << l << " has been found.\n";
} else {
std::cout << "\n " << rs << '\n';
}
}
break;
case opt::add:
std::cout << "Please, enter the name You want to add: ";
std::cin >> str; // read a string from cin, you can write more than a word (separated by spaces)
std::cin.ignore(ss_max,'\n'); // but only the first is stored
if ( names.add_word(str) ) {
std::cout << "\n The name " << str << " has been successfully added.\n";
} else {
std::cout << "\n No name has been added";
if ( str != "" ) std::cout << ", "<< str << " is already present.\n";
else std::cout << ".\n";
}
break;
case opt::remove:
std::cout << "Please, enter the name You want to remove: ";
std::cin >> str;
if ( names.remove_word(str) ) {
std::cout << "\n " << str << " has been successfully removed.\n";
} else {
std::cout << "\n No name has been removed";
if ( str != "" ) std::cout << ", " << str << " wasn't found.\n";
else std::cout << ".\n";
}
break;
case opt::menu:
menu_options.show();
break;
default:
std::cout << "\n Sorry, that's not an option.\n";
}
} while ( choise != opt::exit );
return 0;
}
I hope it could help.
#include <iostream>
#include <vector>
#include <cstring>
#include <ctime>
#include <cstdlib>
using namespace std;
const char* return_rand_name(const char** names, size_t length)
{
std::vector<size_t> indexes;
for(int i=0; names[i][0] != 0; ++i)
if(strlen(names[i]) == length)
indexes.push_back(i);
if(indexes.size()==0)
return NULL;
return names[indexes[rand()%indexes.size()]];
}
int main()
{
srand(time(NULL));
const char* names[] = {"Alex","Tom","Annie","Steve","Jesus","Leo","Jerry",""};
std::cout << return_rand_name(names, 3) << std::endl;
return 0;
}
Also, if you want to use functions like strlen, include <cstring>, not <string>. The latter provides std::string, which you should use in C++ instead of char*.

Find unique strings in C++, and generate associated lookup vector

I have a vector of strings in C++:
vector<string> myVect = {"A", "A", "A", "B", "B", "A", "C", "C", "foo", "A", "foo"};
How can I convert this to a vector of integers, so that each integer uniquely corresponds to a string in myVect?
i.e. I would like a vector
out = {0, 0, 0, 1, 1, 0, 2, 2, 3, 0, 3}
In addition, I would like a vector of the unique strings, each position corresponding to the number in out:
uniqueStrings = {"A", "B", "C", "foo"}
So far I have the following:
vector<string> uniqueStrings; // stores list of all unique strings
vector<int> out(myVect.size());
for (int i = 0; i < myVect.size(); ++i)
{
// seeing if this string has been encountered before
bool assigned = false;
for (int j = 0; j < uniqueStrings.size(); ++j)
if (!myVect.at(i).compare( uniqueStrings.at(j) ))
{
out.at(i) = j;
assigned = true;
break;
}
// if not, add new example to uniqueStrings
if (!assigned)
{
uniqueStrings.push_back(myVect.at(i));
out.at(i) = uniqueStrings.size() - 1; // index of the string that was just added
}
}
This works, but surely there must be a better way?
Keep pushing them in a map where the string is the key and the value corresponds to the id of each string. Then the values of your map will uniquely correspond to the strings and the keys will be the unique strings.
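A minimal sketch of this idea follows, using std::unordered_map as the map (an assumption on my part; std::map would work the same way) and assigning IDs in order of first appearance. The names `Encoded` and `encode` are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct Encoded {
    std::vector<int> ids;             // ids[i] is the ID of input[i]
    std::vector<std::string> unique;  // unique[id] is the string with that ID
};

// Assigns each distinct string an ID in order of first appearance.
Encoded encode(const std::vector<std::string>& input)
{
    Encoded result;
    result.ids.reserve(input.size());
    std::unordered_map<std::string, int> id_of;
    for (const std::string& s : input) {
        // emplace inserts only if s is not yet a key; in both cases
        // entry.first points at the map element for s.
        auto entry = id_of.emplace(s, static_cast<int>(result.unique.size()));
        if (entry.second) result.unique.push_back(s);
        result.ids.push_back(entry.first->second);
    }
    assert(result.ids.size() == input.size());
    return result;
}
```

For the vector in the question, `encode(myVect).ids` holds the requested `out` values and `encode(myVect).unique` holds the unique strings, with one hash lookup per element instead of a scan over all unique strings seen so far.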
Use a set.
# include <set>
...
set <string> uniqueStrings;
...
for (int i = 0; i < myVect.size(); ++i)
{
uniqueStrings.insert(myVect[i]);
}
Here's a more or less complete example of how you might use a std::map<> to maintain a mapping of unique strings to an integer ID:
#include <algorithm>
#include <iostream>
#include <map>
#include <string>
#include <vector>
using namespace std;
// a simple functor type that makes it easier to dump the contents of a
// container of simple values or a container of std::pair
struct dump
{
template <typename K, typename V>
void operator()( typename std::pair<K,V> const& x)
{
cout << x.first << " ==> " << x.second << endl;
}
template <typename T>
void operator()( T const& x)
{
cout << x << endl;
}
};
#define NUM_ELEM(x) (sizeof(x)/sizeof(x[0]))
char const* data[] = {"A", "A", "A", "B", "B", "A", "C", "C", "foo", "A", "foo"};
int main() {
// initialize the data set
vector<string> myVect( data, data + NUM_ELEM(data));
cout << "dump of initial data set" << endl << endl;
for_each( myVect.begin(), myVect.end(), dump());
map<string,size_t> uniqueStrings; // stores collection of all unique strings
for (vector<string>::iterator i = myVect.begin(); i != myVect.end(); ++i) {
// I'm using uniqueStrings.size() as a convenience here...
// I just needed something to generate unique ID's easily,
// it might not be appropriate to use size() for your ID's in real life
// this will insert the new mapping if there's not already one
uniqueStrings.insert( make_pair(*i, uniqueStrings.size()));
}
cout << endl << endl<< "dump of uniqueStrings" << endl << endl;
for_each( uniqueStrings.begin(), uniqueStrings.end(), dump());
// I'm not sure if you'd need this `out` vector anymore - you can probably just
// use the `uniqueStrings` map directly for this information (but that would
// depend on your specific needs)
vector<int> out;
for (vector<string>::iterator i = myVect.begin(); i != myVect.end(); ++i) {
out.push_back( uniqueStrings[*i]);
}
cout << endl << endl << "dump of `out` vector" << endl << endl;
for_each( out.begin(), out.end(), dump());
return 0;
}