C++ Usage of set, iterator, find line where duplicate was found

The program adds different strings to a set. An iterator checks the set for a certain string; what I want to achieve is to get the line where the iterator finds this string. Is it possible to get this with a set, or do I have to create a vector? The reason I use a set is that I also don't want duplicates in the end. It is a bit confusing, I know; I hope you'll understand.
Edit: I want to get the line number of the original element already existing in the set if a duplicate is found.
#include <iostream>
#include <set>
#include <string>
#include <vector>
#include <atlstr.h>
#include <sstream>
using namespace std;
int _tmain(int argc, _TCHAR* argv[])
{
set<string> test;
set<string>::iterator it;
vector<int> crossproduct(9, 0);
for (int i = 0; i < 6; i++)
{
crossproduct[i] = i+1;
}
crossproduct[6] = 1;
crossproduct[7] = 2;
crossproduct[8] = 3;
for (int i = 0; i < 3; i++)
{
ostringstream cp; cp.precision(1); cp << fixed;
ostringstream cp1; cp1.precision(1); cp1 << fixed;
ostringstream cp2; cp2.precision(1); cp2 << fixed;
cp << crossproduct[i*3];
cp1 << crossproduct[i*3+1];
cp2 << crossproduct[i*3+2];
string cps(cp.str());
string cps1(cp1.str());
string cps2(cp2.str());
string cpstot = cps + " " + cps1 + " " + cps2;
cout << "cpstot: " << cpstot << endl;
it = test.find(cpstot);
if (it != test.end())
{
//Display here the line where "1 2 3" was found
cout << "i: " << i << endl;
}
test.insert(cpstot);
}
set<string>::iterator it2;
for (it2 = test.begin(); it2 != test.end(); ++it2)
{
cout << *it2 << endl;
}
cin.get();
return 0;
}

"Line number" is not very meaningful to a std::set<string>,
because as you add more strings to the set you may change the
order in which the existing strings are iterated through
(which is about as much of a "line number" as the std::set template
itself will give you).
Here's an alternative that may work better:
std::map<std::string, int> test.
The way you use this is you keep a "line counter" n somewhere.
Each time you need to put a new string cpstot in your set,
you have code like this:
std::map<std::string, int>::iterator it = test.find(cpstot);
if (it == test.end())
{
test[cpstot] = n;
// alternatively, test.insert(std::pair<std::string, int>(cpstot, n))
++n;
}
else
{
// this prints out the integer that was associated with cpstot in the map
std::cout << "i: " << it->second;
// Notice that we don't try to insert cpstot into the map in this case.
// It's already there, and we don't want to change its "line number",
// so there is nothing good we can accomplish by an insertion.
// It's a waste of effort to even try.
}
If you set n = 0 before you start putting any strings in test
(and don't change the value of n in any other way),
then you will end up with strings at "line numbers" 0, 1, 2, etc.
in test, and n will be the number of strings stored in test.
By the way, neither std::map<std::string, int>::iterator nor
std::set<std::string>::iterator is guaranteed to iterate through
the strings in the sequence in which they were first inserted.
Instead, what you'll get is the strings in whatever order the
template's comparison object puts the string values.
(By default the comparison object is std::less, so you get them back
in lexicographic order by character value, that is, roughly "alphabetized".)
But when you store the original "line number" of each string in
std::map<std::string, int> test, when you are ready to
print out the list of strings you can copy the string-integer pairs
from test to a new object, std::map<int, std::string> output_sequence,
and now (assuming you do not override the default comparison object)
when you iterate through output_sequence you will get its
contents sorted by line number.
(You will then probably want to get the string
from the second field of the iterator.)

Related

C++ finding uint8_t in vector<uint8_t>

I have the following simple code. I declare a vector and initialize it with one value, 21 in this case. Then I try to find that value in the vector using find. I can see that the element "21" is in the vector, since I print it in the for loop. So why does the check on the iterator returned by find not seem to succeed?
vector<uint8_t> v = { 21 };
uint8_t valueToSearch = 21;
for (vector<uint8_t>::const_iterator i = v.begin(); i != v.end(); ++i){
cout << unsigned(*i) << ' ' << endl;
}
auto it = find(v.begin(), v.end(), valueToSearch);
if ( it != v.end() )
{
string m = "valueToSearch was found in the vector " + valueToSearch;
cout << m << endl;
}

Are you sure it doesn't work?
I just tried it:
#include <algorithm> // std::find
#include <cstdint>   // uint8_t
#include <iostream>  // std::cout
#include <string>
#include <vector>
using namespace std;
int main()
{
vector<uint8_t> v = { 21 };
uint8_t valueToSearch = 21;
for (vector<uint8_t>::const_iterator i = v.begin(); i != v.end(); ++i){
cout << unsigned(*i) << ' ' << endl;
}
auto it = find(v.begin(), v.end(), valueToSearch);
if ( it != v.end() )
{// if we hit this condition, we found the element
string error = "valueToSearch was found in the vector ";
cout << error << int(valueToSearch) << endl;
}
return 0;
}
There are two small modifications, both in the last lines inside the "if".
First, you cannot directly add a number to a string literal. This line does
pointer arithmetic on the literal instead of appending the number:
string m = "valueToSearch was found in the vector " + valueToSearch;
With the changes, the program prints:
21
valueToSearch was found in the vector 21
Second, cout does have an insertion operator (<<) for uint8_t, but it
treats the value as a character rather than a number,
so you need to convert it to int first:
cout << error << int(valueToSearch) << endl;
So find is working correctly: it tells you that it found the number in the first position, and therefore it != v.end() (end is not a valid element, but it is a valid iterator that marks the end of your container).

How to print uncommon value between 2 array?

I want to compare the values stored in filename[i] and filename[j] and print the values in filename[i] that do not have the same filename as any in filename[j]. I know it is possible to do this with sort and set_difference, but I do not know exactly how to write the sort and set_difference code. Here is my original code so that you can test it and better understand what I'm trying to do.
my full code:
#include <string>
#include <iostream>
#include <ctime> //important when to make random filename- srand(time(0))
#include <opencv2\opencv.hpp> //important when using opencv
#include <vector> //when using vector function
using namespace std;
using namespace cv; //important when using opencv
int main(int argc, char* argv[]) {
vector<String> filenames;
int a, i;
srand(time(0)); //seed random filenames - for random filename
// Get all jpg in the folder
cv::glob("C:\\Users\\x\\Documents\\Aggressive\\abc", filenames);
for (size_t i = 0; i < filenames.size(); i++)
{
Mat im = imread(filenames[i]); //read the filename location
std::cout << "\n";
std::size_t found = filenames[i].find_last_of("//\\");
//std:cout << " file: " << filenames[j].substr(found + 1) << '\n'; //display filename and its format (.jpg)
std::string::size_type const p(filenames[i].substr(found + 1).find_last_of('.')); //eg: 2.jpg then it will find the last '.'
std::string file_without_extension = filenames[i].substr(found + 1).substr(0, p); //eg: 2
std::cout << " file : " << filenames[i].substr(found + 1).substr(0, p); //display filename without .jpg
}
cout << "\n";
cout << "There's " << filenames.size() << " files in the current directory.\n" << endl; // total file in the specific directory
cout << "Enter array size: \n";
cin >> a;
for (int j = 0; j < filenames.size(); j++) {
//generate random filename
int index = rand() % filenames.size(); //random based on total of the file in the directory
//cout << filenames[index] << endl; //display the random number but might be redundant
//swap filenames[j] with filenames[index]
string temp = filenames[j];
filenames[j] = filenames[index];
filenames[index] = temp;
}
for (int j = 0; j < a; j++) {
//cout << "Random image selected:" << filenames[j] << endl; //basically to avoid the redundant random filename
Mat im = imread(filenames[j]); //read filename location
std::size_t found = filenames[j].find_last_of("//\\");
//std:cout << " file: " << filenames[j].substr(found + 1) << '\n'; //display filename and its format (.jpg)
std::string::size_type const p(filenames[j].substr(found + 1).find_last_of('.')); //eg: 2.jpg then it will find the last '.'
std::string file_without_extension = filenames[j].substr(found + 1).substr(0, p); //eg: 2
std::cout << " file: " << filenames[j].substr(found + 1).substr(0, p); //display filename without .jpg
string written_directory = "C:/Users/x/Documents/folder/" + filenames[j].substr(found + 1).substr(0, p) + ".jpg"; // write filename based on its original filename.
imwrite(written_directory, im);
}
return 0;
}

In my opinion this is a perfect example of an XY Problem. From your question, from your code, and even from the comments, people do not really understand what you want to do. By that I mean: what do you want to achieve?
My vague guess is that you want to copy a specified number of randomly selected JPEG files from one directory to another, and that you want to show the filenames of the files that will not be copied.
Let me give you some examples of what causes all this confusion.
First and most important, you do not show the full code. Definitions, variable types, and functions are missing. This is also not a Minimal, Reproducible Example. And the description in your question is hard to understand.
I have two set of array
You have "two set array"? Do you mean you have 2 std::set of std::array? Or maybe you simply have 2 std::vector of std::string? From what we can see in the code, we could assume a std::vector<std::string>, but we do not know, because you did not show the definition of "filenames".
Then, you are talking about "2" of something. But we see only one "filenames". So, 2 or 1?
In a comment you wrote:
in the array 2 i had a random filename based on the size of array that an user entered
My guess is that you do not want to have a random filename, but you want to select filenames with a random index from the first vector and put it into a 2nd vector? But we can see only 1 vector "filenames" where you do some random swapping activity.
Then you have written
imread is actually to read the whole file in the folder of directory
This function is very important, what does it do? And what do you mean by "read the file"? Do you mean "filename", so the name of the file? Or the contents of the file? And what is the meaning of "folder of directory"? All filenames in one folder? Or subfolder of a directory entry?
So now my objective is to print out all the file that do not have same filename in the array 2
Again, do we really have 2 arrays (vectors)? Are they different?
And then, where do you copy the files?
So, you see, it is very hard to understand. Even if people would like to help you, they cannot, because they do not understand you. It would be better to show a link to your original homework. Then people can help you. Members here on Stack Overflow want to help. But please allow them to do so.
Here I give you an abstract example for the random selection problem and set_difference problem:
#include <iostream>
#include <vector>
#include <string>
#include <algorithm>
#include <iterator> // std::back_inserter
#include <random>
int main() {
// Define 2 Vectors for filenames
// This vector is an example for files that could be in a specified directory
std::vector<std::string> fileNamesInDirectory{"8.jpg","5.jpg", "6.jpg", "9.jpg", "1.jpg", "4.jpg", "2.jpg", "3.jpg", };
// Print the filenames as information for the user
for (size_t i = 0U; i < fileNamesInDirectory.size(); ++i) {
std::cout << fileNamesInDirectory[i] << "\n";
}
// Next: Select randomly a given number of filenames from the above vector
// So, first get the number of selections. Inform the user
std::cout << "\nEnter a number of filenames that should be copied randomly. Range: 1-"<< fileNamesInDirectory.size()-1 << "\t";
size_t numberOfSelectedFileNames{};
std::cin >> numberOfSelectedFileNames;
// Check for valid range
if (numberOfSelectedFileNames == 0 || numberOfSelectedFileNames >= fileNamesInDirectory.size()) {
std::cerr << "\n*** Error. Wrong input '" << numberOfSelectedFileNames << "'\n";
}
else {
// Copy all data from fileNamesInDirectory
std::vector<std::string> selection{ fileNamesInDirectory };
// Shuffle the data randomly: Please see here: https://en.cppreference.com/w/cpp/algorithm/random_shuffle
std::random_device rd;
std::mt19937 g(rd());
std::shuffle(selection.begin(), selection.end(), g);
// Resize to the number, given by the user
selection.resize(numberOfSelectedFileNames);
// Now we have a random list of filenames
// Show, what we have so far. Now, because we are learning, we will use the range based for
std::cout << "\n\nOriginal file names:\n";
for (const std::string& s : fileNamesInDirectory) std::cout << s << "\n";
std::cout << "\n\nRandomly selected file names:\n";
for (const std::string& s : selection) std::cout << s << "\n";
// Sort both vectors
std::sort(fileNamesInDirectory.begin(), fileNamesInDirectory.end());
std::sort(selection.begin(), selection.end());
// Show again to the user
std::cout << "\n\nOriginal file names sorted:\n";
for (const std::string& s : fileNamesInDirectory) std::cout << s << "\n";
std::cout << "\n\nRandomly selected file names sorted:\n";
for (const std::string& s : selection) std::cout << s << "\n";
// Now, find out the difference of both vectors, meaning, what will not be selected and later copied
std::vector<std::string> difference{};
// Calculate the difference with a std::algorithm: https://en.cppreference.com/w/cpp/algorithm/set_difference
std::set_difference(fileNamesInDirectory.begin(), fileNamesInDirectory.end(), selection.begin(), selection.end(), std::back_inserter(difference));
std::cout << "\n\nThe following file names have not been selected:\n";
for (const std::string& s : difference) std::cout << s << "\n";
}
return 0;
}
If you are more advanced, you can use functions from the C++ filesystem library, which will make life easier.

How to get the elements of a tuple

I am creating a Scrabble game and I need to assign a basic score to each word in the dictionary.
I used make_tuple and stored the result in my tuple. Is there a way to access the elements of a tuple as if they were in a vector?
#include <iostream>
#include <tuple>
#include <string>
#include <fstream>
void parseTextFile()
{
std::ifstream words_file("scrabble_words.txt"); //File containing the words in the dictionary (english) with words that do not exist
std::ofstream new_words_file("test.txt"); //File where only existing words will be saved
std::string word_input;
std::tuple<std::string, int> tupleList;
unsigned int check_integrity;
int counter = 0;
while(words_file >> word_input)
{
check_integrity = 0;
for (unsigned int i = 0; i < word_input.length(); i++)
{
if((int)word_input[i] >= 97 && (int)word_input[i] <= 123) //if the letter of the word belongs to the alphabet
{
check_integrity++;
}
}
if(word_input.length() == check_integrity)
{
new_words_file << word_input << std::endl; //add the word to the new file
tupleList = std::make_tuple(word_input, getScore(word_input)); //make tuple with the basic score and the word
counter++; //to check if the amount of words in the new file are correct
std::cout << std::get<0>(tupleList) << ": " << std::get<1>(tupleList) << std::endl;
}
}
std::cout << counter << std::endl;
}

One would generally use a tuple when there are more than two values of different types to store. For just two values, a pair is a better choice.
In your case, what you want seems to be a list of word-value pairs. You can store them in a container like a vector, but you can also store them as key-value pairs in a map. A std::map is literally a collection of std::pair objects, and tuples are a generalization of pairs.
For completeness, if my understanding of your code's purpose is correct, these are the additions to your code for storing each tuple in a vector. First the declarations,
std::tuple<std::string, int> correct_word = {};
std::vector<std::tuple<std::string, int>> existing_words = {};
changes in the loop that saves existing words - here you want to add each word-value tuple to the vector,
if(word_input.length() == check_integrity)
{
// ...
correct_word = std::make_tuple(word_input, getScore(word_input));
existing_words.push_back(correct_word);
// ...
}
and finally an example of usage outside the construction loop:
for (size_t iv=0; iv<existing_words.size(); ++iv)
{
correct_word = existing_words[iv];
std::cout << std::get<0>(correct_word) << ": " << std::get<1>(correct_word) << std::endl;
}
std::cout << counter << std::endl;
The same code with a map would look like:
The only declaration would be a map from strings to values (instead of a tuple and vector of tuples),
std::map<std::string, int> existing_words = {};
In the construction loop you would be creating the map pair in a single line like this,
if(word_input.length() == check_integrity)
{
// ...
existing_words[word_input] = getScore(word_input);
// ...
}
After construction, you access map elements using .first for the word and .second for the score. Below is a printing example that uses a range-based for loop with auto:
for (const auto& correct_word : existing_words)
std::cout << correct_word.first << ": " << correct_word.second << std::endl;
std::cout << counter << std::endl;
Notice that maps are ordered by key by default (alphabetically for strings). You can provide your own ordering rules, or use an unordered map if you don't need any ordering.

vector element compare c++

This program takes a word from text and puts it in a vector; after this it compares every element with the next one.
So I'm trying to compare element of a vector like this:
sort(words.begin(), words.end());
int cc = 1;
int compte = 1;
int i;
//browse the vector
for (i = 0; i <= words.size(); i++) { // comparison
if (words[i] == words[cc]) {
compte = compte + 1;
}
else { // displaying the word with comparison
cout << words[i] << " Repeated : " << compte; printf("\n");
compte = 1; cc = i;
}
}
My problem is in the bounds: i+1 may exceed the vector borders. How do I handle this case?

You need to pay more attention to the initial conditions and bounds when you iterate and compare at the same time. It is usually a good idea to walk through your code with pen and paper first.
sort(words.begin(), words.end()); // make sure !words.empty()
int cc = 0; // index of the word we need to compare.
int compte = 1; // counting of the number of occurrence.
for( size_t i = 1; i < words.size(); ++i ){
// since you already count the first word, now we are at i=1
if( words[i] == words[cc] ){
compte += 1;
}else{
// words[i] is going to be different from words[cc].
cout << words[cc] << " Repeated : " << compte << '\n';
compte = 1;
cc = i;
}
}
// to output the last word with its repeat
cout << words[cc] << " Repeated : " << compte << '\n';
Some additional information:
there are better ways to count the number of word appearances.
For example, one can use unordered_map<string,int>.
Hope this helps.

C++ uses zero-based indexing, e.g., an array of length 5 has indices: {0, 1, 2, 3, 4}. This means that index 5 is outside of the range.
Similarly, given an array arr of characters:
char arr[] = {'a', 'b', 'c', 'd', 'e'};
The loop for (int i = 0; i <= std::size(arr); ++i) { arr[i]; } will cause a read from outside of the range when i is equal to the length of arr, which causes undefined behaviour. To avoid this the loop must stop before i is equal to the length of the array.
for (std::size_t i = 0; i < std::size(arr); ++i) { arr[i]; }
Also note the use of std::size_t as type of the index counter. This is common practice in C++.
Now, let's finish with an example of how much easier this can be done using the standard library.
std::sort(std::begin(words), std::end(words));
std::map<std::string, std::size_t> counts;
std::for_each(std::begin(words), std::end(words), [&] (const auto& w) { ++counts[w]; });
Output using:
for (auto&& [word, count] : counts) {
std::cout << word << ": " << count << std::endl;
}

My problem is in the bounds: i+1 may exceed the vector borders. How do I
handle this case?
In modern C++ coding, the problem of an index going past vector bounds can be avoided. Use the STL containers and avoid using indices. With a little effort devoted to learning how to use containers this way, you should never see these kinds of 'off-by-one' errors again! As a benefit, the code becomes easier to understand and maintain.
#include <iostream>
#include <map>
#include <string>
#include <vector>
using namespace std;
int main() {
// a test vector of words
vector< string > words { "alpha", "gamma", "beta", "gamma" };
// map unique words to their appearance count
map< string, int > mapwordcount;
// loop over words
for( auto& w : words )
{
// insert word into map
auto ret = mapwordcount.insert( pair<string,int>( w, 1 ) );
if( ! ret.second )
{
// word already present
// so increment count
ret.first->second++;
}
}
// loop over map
for( auto& m : mapwordcount )
{
cout << "word '" << m.first << "' appears " << m.second << " times\n";
}
return 0;
}
Produces
word 'alpha' appears 1 times
word 'beta' appears 1 times
word 'gamma' appears 2 times
https://ideone.com/L9VZt6
If some book or person is teaching you to write code full of
for (i = 0; i < ...
then you should run away quickly and learn modern coding elsewhere.

Same repeated words counting using some C++ STL goodies via multiset and upper_bound:
#include <iostream>
#include <vector>
#include <string>
#include <set>
int main()
{
std::vector<std::string> words{ "one", "two", "three", "two", "one" };
std::multiset<std::string> ms(words.begin(), words.end());
for (auto it = ms.begin(), end = ms.end(); it != end; it = ms.upper_bound(*it))
std::cout << *it << " is repeated: " << ms.count(*it) << " times" << std::endl;
return 0;
}
https://ideone.com/tPYw4a

Iterating through two maps in c++

I would like to loop through two maps at the same time, how could I achieve this?
I have two maps and want to print both; can I use (auto it : mymap) twice within one for? Something like:
for (auto it: mymap && auto on: secondMap)
is this even allowed?
I am trying to print values like (value1, value2), where each of the values is in a different map. The maps do not necessarily contain exactly the same items, but the key is an Instruction and the value is an integer. So if there is an element in the second map for value2, there is not necessarily a value1 for the same key in the first map; in that case value1 should be 0, which is the default integer value.
Any ideas?
Perhaps it is possible to combine two iterators, one for each map?
Kind regards,
Guus Leijsten

You can use a regular for-loop for this:
#include <cstdlib> // EXIT_SUCCESS
#include <iostream>
#include <map>
#include <string>
int main(int argc, char* argv[]) {
std::map<int, std::string> m1, m2;
m1.insert({15, "lala"});
m1.insert({10, "hey!"});
m1.insert({99, "this"});
m2.insert({50, "foo"});
m2.insert({51, "bar"});
for(auto it_m1 = m1.cbegin(), end_m1 = m1.cend(),
it_m2 = m2.cbegin(), end_m2 = m2.cend();
it_m1 != end_m1 || it_m2 != end_m2;)
{
if(it_m1 != end_m1) {
std::cout << "m1: " << it_m1->first << " " << it_m1->second << " | ";
++it_m1;
}
if(it_m2 != end_m2) {
std::cout << "m2: " << it_m2->first << " " << it_m2->second << std::endl;
++it_m2;
}
}
return EXIT_SUCCESS;
}
Note that because you want to iterate over maps of different sizes, you have to use the || operator in the loop condition. The direct consequence is that you cannot increment in the last part of the for-loop: one of the iterators may already be at the end by then, and incrementing it would be undefined behavior (typically a crash).
You have to check each iterator inside the loop and increment it only while it has not reached the end, as shown in the sample above.
