What I'm confused about is this: I have a multimap whose key is the length of a string (a size_t) and whose value is the string itself.
std::multimap<size_t, std::string> wordMap;
Then I have a pair that stores the equal_range for all strings of size 4. I want to iterate from the start of that equal_range to its end; the start is my pair.first and the end is my pair.second. How would I iterate through every character of the word that pair.first points to, and then compare it against every word between pair.first and pair.second?
pair<multimap<size_t, string>::iterator, multimap<size_t, string>::iterator> key_range;
key_range = wordMap.equal_range(n);
Basically I want to compare every letter in word1 to every character in word2.
Advance itr2 which is word2 to the next word and compare every letter in that to every letter in word1. Do this for every word then advance itr1 which is word1 to another word and compare that to every word.
How would I get every character itr2 points to? I think the first for loop accomplishes this for the first iterator but I don't know how to do it for itr2.
for (word_map::iterator itr = key_range.first; itr != key_range.second; itr++) { //this loop will iterate through every word to be compared
for (word_map::iterator itr2 = next(key_range.first); itr2 != key_range.second; itr2++) { //this loop will iterate through every word being compared against itr1
int i = 0;
int hit = 0;
for (char& c1 : itr->first) {
char& c2{ (itr2)->first[i] };
if(c1 != c2)
hit++;
i++;
}
}
I'd like to compare every letter in every word against each other as long as they have the same string size. Then if hit == 1 that means the words are only off by 1 character and they should be mapped or stored in some type of STL container that groups them. I'm still new to STL so i'm thinking a set but I need to read more into it.
First, you'd be more likely to get assistance if you provided a minimal compilable example. I'm assuming your words are std::strings for this answer, but you know what they say about assuming.
There are utilities like "zip", implemented in Boost specifically for iterating over multiple collections simultaneously, but I don't think there's anything directly comparable in the standard library.
A simple but unpleasantly fiddly approach would be just to manually iterate through both strings. This will output each letter in the two words until either one word ends, or there's a difference.
Note all the fiddly bits: you need to make sure both iterators are valid at all times in case one word ends before the other, and working out what actually happened is a bit cumbersome.
#include <string>
#include <iostream>
int main()
{
std::string word1 = "capsicum";
std::string word2 = "capsicube";
std::string::iterator it1 = word1.begin();
std::string::iterator it2 = word2.begin();
while (it1 != word1.end() && it2 != word2.end())
{
// characters are different!
if (*it1 != *it2)
break;
// characters are the same
std::cout << "Both have: " << *it1 << std::endl;
// advance both iterators
++it1;
++it2;
}
if (it1 == word1.end() && it2 == word2.end())
{
std::cout << "Words were the same!" << std::endl;
}
else if (it1 == word1.end())
{
std::cout << "Word 1 was shorter than word 2." << std::endl;
}
else if (it2 == word2.end())
{
std::cout << "Word 1 was longer than word 2." << std::endl;
}
else
{
std::cout << "Words were different after position " << (it1 - word1.begin())
<< ": '" << *it1 << "' vs '" << *it2 << "'" << std::endl;
}
}
New answer, since the question was significantly updated. I'm still not sure this will do exactly what you want, but I think you should be able to use it to get where you want to go.
I've written this as a minimal, complete, verifiable example, which is ideally how you should pose your questions. I've also used C++11 features for brevity/readability.
Hopefully the inline comments will explain things sufficiently for you to at least be able to do your own research for anything you don't fully understand, but feel free to comment if you have any more questions. The basic idea is to store the first word (using the key_range.first iterator), and then start iterating from the following iterator using std::next(), until we reach the end iterator in key_range.second.
This then gives us word1 outside of the loop, and word2 within the loop, which will be set to every other word in the list. We then use the "dual iteration" technique I posted in my other answer to compare each word character by character.
#include <map>
#include <string>
#include <iostream>
int
main()
{
std::multimap<size_t, std::string> wordMap;
wordMap.insert({4, "dogs"});
wordMap.insert({4, "digs"});
wordMap.insert({4, "does"});
wordMap.insert({4, "dogs"});
wordMap.insert({4, "dibs"});
// original type declaration...
// std::pair<std::multimap<size_t, std::string>::iterator, std::multimap<size_t, std::string>::iterator> key_range;
// C++11 type inference...
auto key_range = wordMap.equal_range(4);
// make sure the range wasn't empty
if (key_range.first == key_range.second)
{
std::cerr << "No words in desired range." << std::endl;
return 1;
}
// get a reference to the first word
std::string const& word1 = key_range.first->second;
std::cout << "Comparing '" << word1 << "' to..." << std::endl;
// loop through every iterator from the key_range, skipping the first
// (since that's the word we're comparing everything else to)
for (auto itr = std::next(key_range.first); itr != key_range.second; ++itr)
{
// create a reference for clarity
std::string const& word2 = itr->second;
std::cout << "... '" << word2 << "'";
// hit counter; where hit is defined as characters not matching
int hit = 0;
// get iterators to the start of each word
auto witr1 = word1.begin();
auto witr2 = word2.begin();
// loop until we reach the end of either iterator. If we're completely
// confident the two words are the same length, we could only check
// one of them; but defensive coding is a good idea.
while (witr1 != word1.end() && witr2 != word2.end())
{
// dereferencing the iterators will yield a char; so compare them
if (*witr1 != *witr2)
++hit;
// advance both iterators
++witr1;
++witr2;
}
// do something depending on the number of hits
if (hit <= 1)
{
std::cout << " ... close enough!" << std::endl;
}
else
{
std::cout << " ... not a match, " << hit << " hits." << std::endl;
}
}
}
#define FIRST 'a'
#define LAST 'd'
#define ALL ((1 << (LAST-FIRST+1)) - 1)
int main()
{
string s1, s2;
s1 = "aabcd";
s2 = "caabd";
// getline(cin, s1);
// getline(cin, s2);
// Build masks
int mask[1 << CHAR_BIT] = {};
for (char c = FIRST; c <= LAST; ++c)
{
auto it1 = s1.begin(), it2 = s2.begin();
bool done = false, fail = false;
int count[1 << CHAR_BIT] = {};
mask[c] = ALL;
// cout << "c:" << c << " " << ALL << " " << mask[c] << '\n';
do {
// Count characters until next match for c
while (it1 != s1.end() && *it1 != c) ++count[(unsigned char)*it1++];
Can someone explain, in layman's terms, how the last statement works: while (it1 != s1.end() && *it1 != c) ++count[(unsigned char)*it1++]; ? Also, what about count[1 << CHAR_BIT]? And how do I print the contents of count[]?
I wrote that code as part of this answer.
The part you're asking about is this:
auto it1 = s1.begin();
int count[1 << CHAR_BIT] = {};
do {
while (it1 != s1.end() && *it1 != c) ++count[(unsigned char)*it1++];
} while (...);
First, it obtains an iterator to the beginning of string s1. It will be used to walk through the string.
auto it1 = s1.begin();
Then it initializes a histogram as the array count, which holds one integer for every possible character value, with all values initialized to zero.
int count[1 << CHAR_BIT] = {};
Let's break down the while-loop in parts:
it1 != s1.end() tests that the iterator has not reached the end of the string
*it1 != c dereferences the iterator (obtaining the character at that position) and compares it to c
++count[(unsigned char)*it1++]; increases the count array at the index given by the current character, and also moves the iterator to the next position in the string.
So what this does is read characters until reaching the end of the string or encountering the character c, and counts all the characters it visited (not including c).
This is only the first half of the Dynamic Programming algorithm I outlined in the original answer. The second half is to then walk through the second string using the same rules and "uncount" all characters visited.
After both strings have been walked to the next occurrence of c, the count array will contain up-to-date information about whether the number of each character seen so far in each string is equal. The mask is then updated, and the outer loop continues stepping through those two strings until the iterators reach the end.
Regarding your other question:
how do i print the content of the count[] out?
Like this:
for (char c = FIRST; c <= LAST; ++c)
{
cout << "count(" << c << ") = " << count[(unsigned char)c] << "\n";
}
it1++: this is a post-increment statement. This means that the returned value from the operation it1++ would be the previous value held by it1 (meaning, before the increment).
*it1++: the * operator, when applied to iterators, accesses its contents. So, here we're accessing the contents of the previous iterator (meaning, the one before the increment took place).
(unsigned char)*it1++: cast the value contained in the previous iterator to unsigned char (probably from char; this is to avoid negative values as an array index).
count[(unsigned char)*it1++]: use the previous value as an index for the count array.
++count[(unsigned char)*it1++]: this is a pre-increment expression. This means that the return value will be the result of incrementing the value contained at the given index. Since the array is zero-initialized using empty braces ({}), all values will initially evaluate to 0. So, adding:
cout << ++count[(unsigned char)*it1++] << "\n"; will print the actual count for that letter, with a minimum of 1 since it starts at 0 and it's a pre-increment.
I have a list
list<pair<Zeitpunkt, double>> l_tempdiff;
And I only want to cout the first 5 elements.
I only know the way of couting the whole list with:
for (auto elem : l_tempdiff)
{
cout << elem.first << elem.second << endl;
}
I don't know how to access my elements when I use:
for (it = l_tempdiff.begin(); it != l_tempdiff.end(); ++it)
{
}
And I guess I need to change l_tempdiff.end() to some other value, but it doesn't seem to accept just the number 5. How can I do this?
Since std::list iterators are not random access, you cannot just increment them like l_tempdiff.begin() + 5. What you can do is use std::next to increment the iterator the required number of times. That would look like:
for (auto it = l_tempdiff.begin(), end = std::next(l_tempdiff.begin(), 5); it != end; ++it)
{
// use `*it` here
}
Before doing this though you should make sure the list is big enough because if it isn't then you'll have undefined behavior.
You only want to output the first five elements?
Well, a for-range-loop is a good place to start, just add the additional constraint as a break-condition:
int i = 0;
for (auto&& elem : l_tempdiff)
{
if (5 < ++i) break;
cout << elem.first << elem.second << endl;
}
I changed auto to auto&& to avoid needless copying.
As an aside, consider reading "Why is "using namespace std" considered bad practice?" and "C++: "std::endl" vs "\n"".
list<pair<Zeitpunkt,double> > :: iterator it;
int m = 0;
it = l_tempdiff.begin();
while( it != l_tempdiff.end() && m < 5)
{
cout<<it->second<<"\n";
m++;
it++;
}
Try
auto it = l_tempdiff.begin();
auto end = l_tempdiff.end();
for (int count = 0; count < 5 && it != end; ++count)
{
std::cout << it->first << it->second << std::endl;
std::advance(it, 1); // std::advance needs the distance as its second argument
}
This prints the first five pairs (or all the pairs, if there are fewer than 5).
count is used to control the maximum number of elements to be printed.
it is an iterator that, in each iteration of the loop, references the current pair.
Note that advancing an end iterator gives undefined behaviour. So it is necessary to terminate the loop if the end iterator is reached (hence the it != end test in the loop condition) or if the maximum number of elements (5) is reached.
I'm stuck on a very interesting problem.
You might have done this before in C/C++
map<string, string> dict;
int dsz = dict.size();
vector<string> words;
int sz = words.size();
for(int i = 0; i < sz; ++i)
{
for(int j = i + 1; j < dsz; ++j)
{
}
}
How can I achieve the same thing using iterators?
Please suggest.
OK.
I figured it out.
More precisely, I wanted both i and j in the inner loop.
Here is what I did with iterators. Sorry, I had to move to a multimap instead of a map due to a change in requirements.
vector<string>::iterator vit;
multimap<string, string>::iterator top = dict.begin();
multimap<string, string>::iterator mit;
for(vit = words.begin(); vit != words.end(); ++vit)
{
string s = *vit;
++top;
mit = top;
for(; mit != dict.end(); ++mit)
{
/* compare the value with other val in dictionary, if found, print their keys */
if(dict.find(s)->second == mit->second)
cout << s <<" = "<< mit->first << endl;
}
}
Any more efficient way to do this would be appreciated.
Your final intent is not fully clear, because you start the j loop at i+1 (see the remarks at the end). Until you clarify this relationship, I propose two interim solutions.
Approach 1: easy and elegant:
You use the new C++11 range based for(). It makes use of an iterator starting with begin() and going until end(), without you having to bother with this iterator:
for (auto x : words) { // loop on word (size sz)
for (auto y : dict) { // loop on dict (size dsz)
// do something with x and y, for example:
if (x==y.first)
cout << "word " << x << " matches dictionary entry " << y.second << endl;
}
}
Approach 2: traditional use of iterators
You can also explicitly specify the iterators to be used. This is a little wordier than the previous example, but it allows you to choose the most suitable iterator, for example if you want a constant iterator via cbegin() instead of begin(), if you want to skip some elements, or if you want to use an iterator adaptor such as reverse_iterator:
for (auto itw = words.begin(); itw != words.end(); itw++) {
for (auto itd = dict.begin(); itd != dict.end(); itd++) {
// do something with *itw and *itd, for example:
if (*itw == itd->first)
cout << "word " << *itw << " matches dictionary entry " << itd->second << endl;
}
}
Remarks:
Starting the inner loop at j=i+1 makes sense only if the elements of the word vector are related to the elements of the dict map (OK, they are certainly words as well), AND if the order in which you access the elements of the map is related to the order in the vector. As a map is ordered by its key, this would make sense only if words were sorted by the same key. Is that the case?
If you still want to skip elements or make calculations based on the distance between elements, you should consider the second approach proposed above. It makes it easy to use distance(words.begin(), itw), which is the equivalent of i.
However, it's best to use containers in a way that takes advantage of their design. So instead of iterating through a dictionary map to find a word entry, it's better to use the map as follows:
for (auto x : words) { // loop on word (size sz)
if (dict.count(x)) // if x is in dictionary
cout << "word " << x << " matches dictionary entry " << dict[x] << endl;
}
My homework is to remove duplicates from a random string. My idea is to use 2 loops to solve the problem.
The 1st one will scan every character in the string.
The 2nd one will check whether that character is duplicated. If so, remove the character.
string content = "Blah blah..."
for (int i = 0; i < content.size(); ++i) {
char check = content.at(i);
for (int j = i + 1; j < content.size() - 1; ++j) {
if (check == content.at(j)) {
content.erase(content.begin()+j);
}
}
}
The problem is it doesn't work. It always removes the wrong character. Seems an indices problem but I don't understand why.
A temporary fix is change content.erase(content.begin()+j); to content.erase( remove(content.begin() + i+1, content.end(), check),content.end());
But I don't think triggering a "remove by value" scan is a nice way to do it. I want to do it with 2 loops or fewer.
Any ideas will be appreciated :)
Your loops could look the following way
#include <iostream>
#include <string>
int main()
{
std::string s = "Blah blah...";
std::cout << '\"' << s << '\"' << std::endl;
for ( std::string::size_type i = 0; i < s.size(); i++ )
{
std::string::size_type j = i + 1;
while ( j < s.size() )
{
if ( s[i] == s[j] )
{
s.erase( j, 1 );
}
else
{
++j;
}
}
}
std::cout << '\"' << s << '\"' << std::endl;
return 0;
}
The output is
"Blah blah..."
"Blah b."
There are many other approaches using standard algorithms. For example
#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
int main()
{
std::string s = "Blah blah...";
std::cout << '\"' << s << '\"' << std::endl;
auto last = s.end();
for ( auto first = s.begin(); first != last; ++first )
{
last = std::remove( std::next( first ), last, *first );
}
s.erase( last, s.end() );
std::cout << '\"' << s << '\"' << std::endl;
return 0;
}
The output is the same as for the previous code example
"Blah blah..."
"Blah b."
If use of STL is a possible option, you could use an std::unordered_set to keep the characters seen so far and the erase-remove idiom with std::remove_if, like in the following example:
#include <iostream>
#include <string>
#include <unordered_set>
#include <algorithm>
int main() {
std::string str("Hello World!");
std::unordered_set<char> log;
std::cout << "Before: " << str << std::endl;
str.erase(std::remove_if(str.begin(), str.end(), [&] (char const c) { return !(log.insert(c).second); }), str.end());
std::cout << "After: " << str << std::endl;
}
I recommend a two pass approach. The first pass identifies the positions of the duplicated characters; the second pass removes them.
I recommend using a std::set and a std::vector<unsigned int>. The set contains the letters that are in the string. The vector contains the positions of the duplicated letters.
The first pass detects if a letter is present in the set. If the letter exists, the position is appended to the vector. Otherwise the letter is inserted into the set.
For the second pass, sort the vector in descending order.
Erase the character at the position in the vector, then remove the position from the vector.
By erasing characters from the end of the string towards the front, the positions of the remaining duplicates won't change when the character is erased from the string.
I am not sure that this is what is causing your problem, but another problem that I see with your code is in your second for loop. Your j < content.size() - 1 condition should just be
j < content.size().
The reasoning for this is a little tricky to see at first, but in this case size() is not just the size of your string; it also serves as the one-past-the-end index. By shortening the upper bound by one, you never reach the last char in your string. I don't know if this will help with your initial problem, but who knows?
Note: Your actual problem is maintaining a proper index to the next element in question:
If you do not erase a character, the next element is at the next position.
If you erase a character, the next element will move into the place of the current position (the position stays the same).
Also: There are more efficient solutions (eg.: utilizing a set)
OK, so I kind of have 2 questions, but I think they're sort of easy for an experienced programmer and somewhat similar. If it bothers you, simply help me with 1 question and not the other. Basically, I have a map(char, int) that associates each character in a string with the number of times it appears. The problem I'm having is that I can't figure out how to print the entries from most occurring to least occurring. For example, if I type aabbbcddddd, I get a:2 b:3 c:1 d:5. But what I'm trying to get is d:5 b:3 a:2 c:1. I hope I'm explaining it okay...
Second question: I was also wondering how one would go about making a map that does the same thing as above, but with a series of letters OR numbers. Example: with the string 'aabbb001c1 ddd', "aabbb", "c", and "ddd" would all be separate words. "001" and "1" would be numbers; however, they would not be equal. I tried using two separate map(string, int) for this (one for words, one for numbers), with a series cutting off when a character that's not of its "type" appears, but nothing is working. A technique or any advice would be nice. Here's the code I have so far.
#include <iostream>
#include <string>
#include <sstream>
#include <map>
#include <stdio.h>
#include <ctype.h>
#include <algorithm>
using namespace std;
int main()
{
string word;
getline(cin, word);
map<char,int> charCount;
map<string, int> strCount;
map<string, int> numCount;
//turning all characters to lower case
transform(word.begin(), word.end(), word.begin(), ::tolower);
//for loop to count recurring characters
for (unsigned int i=0; i<word.size(); i++){
charCount[word[i]]++;
}
//Having trouble here. This is where i'm doing my series of words & numbers
string temp;
string temp2;
for (unsigned int j=0; j<word.size(); j++){
if (isalpha(word[j]))
temp = temp + word[j];
else{
wordCount[temp]++;
temp2.clear();
}
if (isdigit(word[j]))
temp2 = temp2 + word[j];
else{
numCount[temp2]++;
temp2.clear();
}
}
//print out chars
for (map<char, int>::iterator it = charCount.begin(); it != charCount.end(); ++it)
cout << it->first << ": " << it->second << endl;
//print out words
for (map<string, int>::iterator it = wordCount.begin(); it != wordCount.end(); ++it)
cout << it->first << ": " << it->second <<endl;
//print out numbers
for (map<string, int>::iterator it = numCount.begin(); it != numCount.end(); ++it)
cout << it->first << ": " << it->second << endl;
return 0;
}
A std::map is sorted by its key, not its value, and neither the sorting mechanism nor the values of the keys can be changed after the map has been instantiated.
So your map is already sorted by char, but you want to display it by value -- a different sorting order.
The simplest thing to do would be to construct a std::multimap where the key is the number of occurrences (the value from the original map) and the value is the character. Using a multimap as opposed to a map allows you to have multiple characters with the same number of occurrences. When you're ready to display the values, copy the map keys and values to the multimap and then display the contents of the multimap.
Here's an example:
#include <functional>
#include <iostream>
#include <map>
#include <string>
#include <utility>
using namespace std;
int main() {
const string data = "foofaaster";
// populate the map
map <char, int> chars;
for (auto c = data.begin(); c != data.end(); ++c)
chars [*c]++;
// Now display the occurrences
cout << "Original data: '" << data << "'" << endl;
multimap <int, char, greater <int>> counts;
for (auto c = chars.begin(); c != chars.end(); ++c)
counts.insert (make_pair (c->second, c->first));
cout << "Character counts:" << endl;
for (auto it = counts.begin(); it != counts.end(); ++it)
cout << "\t" << it->first << ": '" << it->second << "'" << endl;
return 0;
}
As for the second question, if your map's keys are static and there's one value per key, then you can use a std::map <std::string, int>. Otherwise, if subelements of the key will have their own values, you might consider another data structure more appropriate for the task, such as a trie.