Moving an object to the front of a vector of string pairs in C++

I have a question about moving an element to the front of a vector of pairs. I already read a related post, but I didn't know how to apply it to a vector of pairs, or more precisely, how to do it with strings. I don't know how to point an iterator at the position I need, because my code also needs an index to compare the two strings. Alternatively, how would I compare two iterators, something like if(it==ite){..}? (I would need two loops for my example, so I named the second iterator "ite".)
I have a vector<pair<string,string>> dictionary and a vector<string> Text. I go through the dictionary and try to find each word from the text in it. That part already works, but now I need to move the element I found to the front of the dictionary and remove it from its old position.
I'm not sure how to do that with a vector of pairs.
Here's my code to show what I mean:
for(size_t i=0;i<Text.size();i++){
for(size_t j=0;j<Duden.size();j++){
if(Text[i]==Duden[j].first){
uebersetzung.push_back(Duden[j].first);
if(Duden[j].first.length()<4){
uebersetzung.push_back(" ");}
if(Duden[j].first.length()<8){
uebersetzung.push_back("\t");} // These are only so it looks cleaner at the end
uebersetzung.push_back("\t\t\t\t: ");
uebersetzung.push_back(Duden[j].second);
uebersetzung.push_back("\n");
// Now here should be the code to rotate the vector so the found element is now at the first position and not at the found position
break;
}
}
}
"Duden" is the dictionary, in case you're wondering. I want to move the found element to the front of the dictionary so that, if the word occurs again in the text, the search finds it in the first few positions instead of going through the whole dictionary again.
How can I accomplish that using rotate? Or do I need erase and insert because rotate doesn't work with a vector of pairs?

Consider using the algorithms in the standard library. To use them, you have to get familiar with iterators. Then you can use something like std::rotate, and it does not matter what element type your vector holds.
For example your code can be refactored in the following way:
for (auto const& word : Text) {
auto it = std::find_if(Duden.begin(), Duden.end(), [&word](auto const& entry) {
return entry.first == word;
});
if(it == Duden.end()) {
continue;
}
generate_translation(uebersetzung, *it);
std::rotate(Duden.begin(), it, it+1);
}
The output code is better moved into its own function, generate_translation(std::vector<string>&, std::pair<string, string> const&), to make the code more readable.
Something like
void generate_translation(std::vector<string>& uebersetzung,
std::pair<string, string> const& entry)
{
uebersetzung.push_back(entry.first);
if(entry.first.length() < 4){
uebersetzung.push_back(" ");
}
if(entry.first.length() < 8){
uebersetzung.push_back("\t");
} // These are only so it looks cleaner at the end
uebersetzung.push_back("\t\t\t\t: ");
uebersetzung.push_back(entry.second);
uebersetzung.push_back("\n");
}
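For reference, here is a minimal self-contained sketch of just the move-to-front step, assuming the dictionary is a std::vector of string pairs as in the question. The alias Dictionary and the helper name move_to_front are made up for this illustration and do not appear in the code above.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical alias for the dictionary type used in the question.
using Dictionary = std::vector<std::pair<std::string, std::string>>;

// Hypothetical helper: moves the entry whose key equals `word` to the front.
// Returns false and leaves the dictionary untouched if the word is absent.
bool move_to_front(Dictionary& dict, const std::string& word) {
    auto it = std::find_if(dict.begin(), dict.end(),
                           [&word](const Dictionary::value_type& entry) {
                               return entry.first == word;
                           });
    if (it == dict.end()) {
        return false;
    }
    // Rotating [begin, it + 1) with `it` as the new first element shifts
    // every entry before `it` one slot to the right and keeps their order.
    std::rotate(dict.begin(), it, it + 1);
    return true;
}
```

Unlike erase followed by insert, the rotate call only touches the elements between the front and the found position.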

Related

C++ - checking a string for all values in an array

I have some parsed text from the Vision API, and I'm filtering it using keywords, like so:
if (finalTextRaw.find("File") != finalTextRaw.npos)
{
LogMsg("Found Menubar");
}
E.g., if the keyword "File" is found anywhere within the string finalTextRaw, then the function is interrupted and a log message is printed.
This method is very reliable. But I've inefficiently just made a bunch of if-else-if statements in this fashion, and as I'm finding more words that need filtering, I'd rather be a little more efficient. Instead, I'm now getting a string from a config file, and then parsing that string into an array:
string filterWords = GetApp()->GetFilter();
std::replace(filterWords.begin(), filterWords.end(), ',', ' '); ///replace ',' with ' '
vector<string> array;
stringstream ss(filterWords);
string temp;
while (ss >> temp)
array.push_back(temp); ///create an array of filtered words
And I'd like to have just one if statement for checking that string against the array, instead of many of them for checking the string against each keyword I'm having to manually specify in the code. Something like this:
if (finalTextRaw.find(array) != finalTextRaw.npos)
{
LogMsg("Found filtered word");
}
Of course, that syntax doesn't work, and it's surely more complicated than that, but hopefully you get the idea: if any words from my array appear anywhere in that string, that string should be ignored and a log message printed instead.
Any ideas how I might fashion such a function? I'm guessing it's going to necessitate some kind of loop.
Borrowing from Thomas's answer, a ranged for loop offers a neat solution:
for (const auto &word : words)
{
if (finalTextRaw.find(word) != std::string::npos)
{
// word is found.
// do stuff here or call a function.
break; // stop the loop.
}
}
As pointed out by Thomas, the most efficient way is to split both texts into lists of words and then use std::set_intersection to find the words that occur in both. You can use std::vector as long as it is sorted. You end up with O(n*log(n)) (with n = max words) rather than O(n*m).
Split sentences to words:
auto split(std::string_view sentence) {
auto result = std::vector<std::string>{};
// string_view is not guaranteed to be null-terminated, so copy it into a std::string
auto stream = std::istringstream{std::string{sentence}};
std::copy(std::istream_iterator<std::string>(stream),
std::istream_iterator<std::string>(), std::back_inserter(result));
return result;
}
Find words existing in both lists. This only works for sorted lists (like sets or manually sorted vectors).
auto intersect(std::vector<std::string> a, std::vector<std::string> b) {
std::sort(a.begin(), a.end());
std::sort(b.begin(), b.end());
auto result = std::vector<std::string>{};
std::set_intersection(std::move_iterator{a.begin()},
std::move_iterator{a.end()},
b.cbegin(), b.cend(),
std::back_inserter(result));
return result;
}
Example of how to use.
int main() {
const auto result = intersect(split("hello my name is mister raw"),
split("this is the final raw text"));
for (const auto& word: result) {
// do something with word
}
}
Note that this makes sense when working with large or undefined number of words. If you know the limits, you might want to use easier solutions (provided by other answers).
You could use a fundamental, brute force, loop:
unsigned int quantity_words = array.size();
for (unsigned int i = 0; i < quantity_words; ++i)
{
std::string word = array[i];
if (finalTextRaw.find(word) != std::string::npos)
{
// word is found.
// do stuff here or call a function.
break; // stop the loop.
}
}
The above loop takes each word in the array and searches the finalTextRaw for the word.
There are better methods using some std algorithms. I'll leave that for other answers.
Edit 1: maps and association
The above code is bothering me because there are too many passes through the finalTextRaw string.
Here's another idea:
Create a std::set using the words in finalTextRaw.
For each word in your array, check for existence in the set.
This reduces the quantity of searches (it's like searching a tree).
You should also investigate creating a set of the words in array and finding the intersection between the two sets.
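The set idea above could be sketched roughly as follows. The function name contains_filtered_word is invented for this example, and the sketch assumes that words in the text are separated by whitespace.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns true if any entry of `filter_words` occurs
// as a whole, whitespace-delimited word in `text`.
bool contains_filtered_word(const std::string& text,
                            const std::vector<std::string>& filter_words) {
    // One pass over the text builds the set of its words.
    std::istringstream stream(text);
    std::set<std::string> text_words;
    std::string word;
    while (stream >> word) {
        text_words.insert(word);
    }
    // Each lookup is logarithmic in the number of distinct words.
    for (const auto& filter : filter_words) {
        if (text_words.count(filter) != 0) {
            return true;
        }
    }
    return false;
}
```

Note that this matches whole words only, whereas std::string::find also matches substrings (so "File" would no longer match "Filename"). Whether that is acceptable depends on the filter list.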

Deleting from std::list in a nested loop returns access violation

I have a large list of elements, with possible duplicates. I want to delete those duplicates, but my program results in an access violation error after deleting around 700 items.
Here is my code:
for (auto it : endlist){
bool first = true;
for (auto it2 : endlist){
if (!first){
if (similar(it, it2)){
endlist.remove(it2);
continue;
}
rotate( it);
if (similar(it, it2)){
endlist.remove(it2);
continue;
}
rotate(it);
if (similar(it, it2)){
endlist.remove(it2);
continue;
}
rotate(it);
if (similar(it, it2)){
endlist.remove(it2);
continue;
}
}
first = false;
}
}
The access violation is thrown in the second for loop. Can somebody explain why this happens?
Why don't you use
std::list::sort()
then
std::list::unique()
instead? It will get rid of all duplicates in a sorted list.
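As for the crash itself: calling remove on the list while a range-based for loop is iterating over it invalidates the loop's hidden iterator whenever the current element is erased, which is most likely what causes the access violation. A minimal sketch of the sort-then-unique approach, assuming the element type provides operator< and operator== (the helper name remove_duplicates is made up here):

```cpp
#include <list>

// Hypothetical helper: removes duplicates from a list.
// The original element order is lost because the list is sorted first.
template <typename T>
void remove_duplicates(std::list<T>& items) {
    items.sort();    // brings equal elements next to each other
    items.unique();  // erases consecutive duplicates
}
```

For elements that only count as duplicates after a rotation, this works only if each element is first brought into one agreed-upon orientation.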
What you asked for:
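// assumes endlist is a std::vector (std::list has no operator[])
for (size_t i = 0; i < endlist.size(); ++i)
{
for (size_t j = i + 1; j < endlist.size(); ) // only compare each pair once by starting at j=i+1
{
if (sometest(endlist[i], endlist[j]))
{
endlist.erase(endlist.begin() + j); // shifts later elements down, so do not advance j
}
else
{
++j;
}
}
}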
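Since endlist in the question is a std::list, which has no operator[], the same pairwise comparison can also be written with iterators. This is only a sketch: remove_similar is a made-up name, and the predicate same stands in for whatever comparison (such as the question's similar) is actually needed.

```cpp
#include <iterator>
#include <list>

// Hypothetical helper: erases every later element that `same(a, b)` reports
// as a duplicate of an earlier one. The first occurrence is kept.
template <typename T, typename Pred>
void remove_similar(std::list<T>& items, Pred same) {
    for (auto it = items.begin(); it != items.end(); ++it) {
        for (auto it2 = std::next(it); it2 != items.end();) {
            if (same(*it, *it2)) {
                // erase returns the iterator after the removed element,
                // so the loop never touches an invalidated iterator.
                it2 = items.erase(it2);
            } else {
                ++it2;
            }
        }
    }
}
```

This stays O(n^2) in the number of elements, but erasing from a std::list does not invalidate iterators to the remaining elements.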
What you didn't ask:
If you are able to change your vector and its elements according to your rotation, this can be done more cleanly with sorting.
For that you'll have to define an operator<(...) for your matrices. This should be possible by comparing their sizes and then comparing them lexicographically. Then you'll want to store the minimal matrix in terms of rotation in your endlist for this to make sense. Once that's done, you can use the other answer's approach for filtering.
And if you don't want to do anything with the duplicates anyway, I'd recommend a container that doesn't allow duplicates in the first place, such as a std::set or std::map.

Can anyone help me make this function more efficient

So I am trying to sort through an unordered_map container. The container reads input from a file which is a list of people. Each line in the file will be like rCB, bIA, and this will be stored as an element in the map. The second string in each element acts as a pointer to the next person in the list, so later on it will appear again in a new line, in this case:bIA,TDV.
So far I can sort through in order by creating an unordered_map iterator and passing the second string to find so that the iterator goes to the next element. My problem is going the other way. I am able to sort in the opposite direction, but my implementation takes a very long time to finish, as we have input files of 3 million people.
list<string> SortEast(unordered_map<string, string> &TempUMap, unordered_map<string, string>::iterator IT, list<string> &TempList)
{
IT = TempUMap.begin();
while (TempList.size() != (TempUMap.size() + 1))
{
if (IT->second == TempList.front())
{
TempList.emplace_front(IT->first);
IT = TempUMap.begin();
}
IT++;
}
return TempList;
}
I've tried to make this more efficient, but I cannot think of how. If I could find the value that goes at the start of the list, I could sort in order starting with that value, but again I don't know how I would find this value easily.
Any help would be appreciated.
EDIT:
A sample of one of our input is:
rBC,biA
vnN,CmR
CmR,gnz
Dgu,OWn
lnh,Dgu
OWn,YMO
YMO,SIZ
XbL,Cjj
TDV,jew
iVk,vnN
wTb,rBC
jew,sbE
sbE,iVk
Cjj,wTb
AGn,XbL
gnz,SMz
biA,TDV
SIZ,uvD
SMz,lnh
This is only 20 people. In this case AGn is the first value and uvD is the last. The output I end up with is:
AGn
XbL
Cjj
wTb
rBC
biA
TDV
jew
sbE
iVk
vnN
CmR
gnz
SMz
lnh
Dgu
OWn
YMO
SIZ
uvD
As this file starts with rBC, that is the point from which I need to sort backwards.
Can you not simply do something like this:
vector<string> orderAllTheNames(const unordered_map<string, string>& input, const string& begin)
{
vector<string> result;
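result.reserve(input.size() + 1); // n pairs link n + 1 names
string current = begin;
result.push_back(current);
while(result.size() < input.size() + 1)
{
current = input.at(current); // at() works on a const map (operator[] does not) and throws if the chain ends early
result.push_back(current); // copy: current is still needed in the next iteration
}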
return result;
}
I may have missed some details as I typed this on my phone. You can add some pointers and/or std::moves if you're worried about too many copies flying around.
I guess it's the same as your solution, but without the awkward list and emplace_front.
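The question also asks how to find the value that goes at the start of the list. One possible approach, sketched here under the assumption that the input forms a single chain, is to look for the name that appears as a key but never as a value. The function name find_first_name is made up for this illustration.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical helper: returns the key that never appears as a value,
// i.e. the first name in the chain. Returns an empty string if there is
// no such key (empty map, or the links form a cycle).
std::string find_first_name(
    const std::unordered_map<std::string, std::string>& links) {
    // Collect every name that is somebody's successor.
    std::unordered_set<std::string> successors;
    successors.reserve(links.size());
    for (const auto& link : links) {
        successors.insert(link.second);
    }
    // The head of the chain is the only key nobody points to.
    for (const auto& link : links) {
        if (successors.count(link.first) == 0) {
            return link.first;
        }
    }
    return {};
}
```

Both passes are linear in the number of lines on average, so finding the head and then walking forward avoids the repeated scans of the map in the original SortEast.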

how to find set of distinct strings from a given string after cyclic shifts?

I am solving a question on Codeforces where the problem statement asks me to find the set of all distinct strings obtained from a given string by cyclic shifts.
For example:
Given string: "abcd"
the output should be 4 ("dabc", "cdab", "bcda", "abcd") [note: "abcd" is also counted]
So
t=s[l-1];
for(i=l-1;i>0;i--)
{
s[i]=s[i-1];
}
s[0]=t;
I applied the above method length - 1 times to get all possible strings, but I am unable to find the distinct ones.
Is there any STL function to do this?
You may use the following:
std::set<std::string>
retrieve_unique_rotations(std::string s)
{
std::set<std::string> res;
res.insert(s);
if (s.empty()) {
return res;
}
for (std::size_t i = 0, size = s.size() - 1; i != size; ++i) {
std::rotate(s.begin(), s.begin() + 1, s.end());
res.insert(s);
}
return res;
}
Not sure about STL specific functions, however a general solution could be to have all shifted strings in a list. Then you sort the list and then you iterate over the list elements. When the current element is different to the last, increment the counter.
There is probably a solution that is less memory intensive. For short strings this solution should be sufficient.
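A rough sketch of the sort-and-count idea described above. The function name count_distinct_rotations is invented for this example, and an empty string is treated as having zero rotations here, which is a choice of this sketch rather than something the problem statement specifies.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: counts the distinct cyclic shifts of `s`.
std::size_t count_distinct_rotations(std::string s) {
    // Collect every rotation, including the original string.
    std::vector<std::string> rotations;
    rotations.reserve(s.size());
    for (std::size_t i = 0; i < s.size(); ++i) {
        rotations.push_back(s);
        std::rotate(s.begin(), s.begin() + 1, s.end());  // shift left by one
    }
    // After sorting, equal rotations are adjacent, so std::unique can
    // collapse them; the new logical end gives the distinct count.
    std::sort(rotations.begin(), rotations.end());
    auto last = std::unique(rotations.begin(), rotations.end());
    return static_cast<std::size_t>(last - rotations.begin());
}
```

This stores all n rotations of length n, so memory use grows quadratically with the string length.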
You can use a vector to collect the strings after each rotation with vector.push_back("string"). Before each push, you can check whether the string already exists by using something like:
if (std::find(vector.begin(), vector.end(), "string") == vector.end()) // not stored yet
{
increment++;
vector.push_back("string");
}
Alternatively, you can count the elements at the end with vector.size() and drop increment++.
Hope this helps.

Erase duplicate element from a vector

I created a vector with several elements in C++ and I want to remove the elements that have the same value. Basically, I want to remove every entry of the vector that is a duplicate of an earlier one. My vector is called person. I am trying to do something like:
for(int i=0; i < person.size(); i++){
if(i>0 && person.at(i) == person.at(0:i-1)) { // matlab operator
continue;
}
writeToFile( person.at(i) );
}
How is it possible to create the operator 0:i-1 to check all possible combinations of indexes?
Edit: I am trying GarMan's solution, but I have issues with the for-each loop:
set<string> myset;
vector<string> outputvector;
for (string element:person)
{
if (myset.find(element) != myset.end())
{
myset.insert(element);
outputvector.emplace_back(element);
}
}
Here is an "in-place" version (no second vector required) that should work with older compilers:
std::set<std::string> seen_so_far;
for (std::vector<std::string>::iterator it = person.begin(); it != person.end();)
{
bool was_inserted = seen_so_far.insert(*it).second;
if (was_inserted)
{
++it;
}
else
{
swap(*it, person.back());
person.pop_back();
}
}
Let me know if this works for you. Note that the order of elements is not guaranteed to stay the same.
Something like this will work
unordered_set<same_type_as_vector> myset;
vector<same_type_as_vector> outputvector;
for (auto&& element: myvector)
{
if (myset.find(element) == myset.end()) // not seen yet
{
myset.insert(element);
outputvector.emplace_back(element);
}
}
myvector.swap(outputvector);
Code written into reply box, so might need tweaking.
If you can sort your vector, you can simply call std::unique.
#include <algorithm>
std::sort(person.begin(), person.end());
person.erase(std::unique(person.begin(), person.end()), person.end());
If you cannot sort, you can use a hash table instead: scan the vector and update the hash table as you go. That lets you check whether an element has already been seen in O(1) on average (O(n) in total). You don't need to compare each element against all the others, which would cost O(n^2).
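A minimal sketch of the hash-table variant, assuming the elements are strings as in the question. The function name remove_duplicates_keep_order is made up for this example. It relies on std::unordered_set::insert returning a pair whose second member is true only when the element was newly inserted.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns a copy of `items` without duplicates,
// keeping the first occurrence of each value in its original order.
std::vector<std::string> remove_duplicates_keep_order(
    const std::vector<std::string>& items) {
    std::unordered_set<std::string> seen;
    std::vector<std::string> result;
    for (const auto& item : items) {
        // insert(...).second is true only the first time a value is seen.
        if (seen.insert(item).second) {
            result.push_back(item);
        }
    }
    return result;
}
```

Unlike the sort-based version, this keeps the surviving elements in their original order, at the cost of the extra memory for the set.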