How are pairs stored in memory in C++? - c++

I was just reading about pairs in C++ when I started wondering how pairs are stored in memory, and whether the identifier assigned to a pair is an object or something else.
Please explain how an array containing pairs uses memory to store them, and how we can iterate through that array, accessing each pair.

As for the pair itself, if you take a look at the standard library source code you'll notice that, after cutting all the boilerplate, std::pair is in the most trivial case just a simple class template:
template<typename First, typename Second>
struct pair
{
First first;
Second second;
};
Now, all the boilerplate is there to ensure that all the special functions like comparison, assignment, copy construction etc. are performed with minimal overhead.
But for the sake of mental model one can think of this simple struct.
As for "array of pairs" - I'm not sure I follow, really.
std::array<std::pair<X,Y>, SIZE>/std::vector<std::pair<X,Y>> behaves just as it would for any other type, i.e. it stores the pairs in a contiguous memory block, end of story.
Same about iteration, there's nothing special about it:
std::array<std::pair<char, int>, 3> pairs{
std::pair{'a', 1},
std::pair{'b', 2},
std::pair{'c', 3}};
for (const auto& p:pairs){
std::cout << p.first << " " << p.second << "\n";
}

Take a look at this example:
#include <array>
#include <cstddef>
#include <iostream>
#include <utility>
int main( )
{
std::array< std::pair<char, char>, 10 > arrayOfPairs { };
std::cout << "size of array: " << sizeof( arrayOfPairs ) << "\n\n";
for ( size_t idx { }; idx < arrayOfPairs.size( ); ++idx) // fill the array with
// std::pair objects
{
arrayOfPairs[ idx ] = std::make_pair( 'a', static_cast<char>( 'b' + idx ) );
}
std::cout << "key" << " " << "value" << '\n';
for ( const auto& p : arrayOfPairs ) // print the keys and values
{
std::cout << " " << p.first << " " << p.second << '\n';
}
return 0;
}
Output:
size of array: 20
key value
a b
a c
a d
a e
a f
a g
a h
a i
a j
a k
In this example, each pair object consists of two chars, so its size is 2 bytes. arrayOfPairs has space for 10 pair objects, which means that its size is 10 * 2 == 20 bytes. In an std::pair object, the key and its value are stored next to each other. It acts like a simple struct.

Related

Access the values (array) of unordered_map c++

I am trying to use the unordered_map in C++, such that, for the key I have a string, while for the value there is an array of floats.
std::unordered_map<std::string, std::array<float, 3>> umap;
But I am not sure how to access the array of values. I know that an iterator is an option for accessing the elements, but how specifically can the elements of the array be accessed?
I am trying to assign these array values to a different array (std::array mapArrayVal).
I tried using
for (auto i = umap.begin(); i != umap.end(); i++)
{
std::array<float, 3> mapArrayVal = (i->second.first, i->second.second,
i->second.third);
}
Is this the correct way? Any help is appreciated, TIA!
This example shows you how to do it, with some comments to help you on your way:
#include <array>
#include <iostream>
#include <string>
#include <unordered_map>
int main()
{
// a map consists of key,value pairs in your case
// the key will have a type of std::string
// the value will be an std::array (with three entries)
std::unordered_map<std::string, std::array<float, 3>> umap{
{"key1", {1.0,2.0,3.0}},
{"key2", {4.0,5.0,6.0}}
};
// iterate over all entries using an explicitly typed iterator
// normally you would write auto instead of std::unordered_map<std::string, std::array<float, 3>>::iterator
// but this shows all the types involved
for (std::unordered_map<std::string, std::array<float, 3>>::iterator it = umap.begin(); it != umap.end(); ++it)
{
// access key/values by iterator it->second will be the array
std::cout << "key = `" << it->first << "`, values : {" << (it->second)[0] << ", " << (it->second)[1] << ", " << (it->second)[2] << "}\n";
}
// however with c++ you could do it in a much more readable way
// combine range based for loop : https://en.cppreference.com/w/cpp/language/range-for
// with structured binding : https://en.cppreference.com/w/cpp/language/structured_binding
// the key_value_pair is const since you only want to observe it for printing
for (const auto& [key, values] : umap)
{
std::cout << "key = `" << key << "`, values : {" << values[0] << ", " << values[1] << ", " << values[2] << "}\n";
}
// use at(key) rather than operator[]: operator[] inserts an "empty" item into the map if the key isn't found
auto& reference_to_array_in_map = umap.at("key1");
// this is how you make a copy of the array
std::array<float, 3> copied_values{ reference_to_array_in_map };
for (const float value : copied_values)
{
std::cout << value << " ";
}
std::cout << "\n";
return 0;
}

Surprising behaviour with an unordered_set of pairs

How can the unordered_set hold both (0, 1) and (1, 0) if they have the same hash value?
#include <iostream>
#include <unordered_set>
#include <utility>
using namespace std;
struct PairHash
{
template <class T1, class T2>
size_t operator()(pair<T1, T2> const &p) const
{
size_t hash_first = hash<T1>{}(p.first);
size_t hash_second = hash<T2>{}(p.second);
size_t hash_combined = hash_first ^ hash_second;
cout << hash_first << ", " << hash_second << ", " << hash_combined << endl;
return hash_combined;
}
};
int main()
{
unordered_set<pair<int, int>, PairHash> map;
map.insert({0, 1});
map.insert({1, 0});
cout << map.size() << endl;
for (auto& entry : map) {
cout << entry.first << ", " << entry.second << endl;
}
return 0;
}
Output:
0, 1, 1
1, 0, 1
2
1, 0
0, 1
An unordered_set holds one instance of each unique data value; it is not limited to holding values with unique hash values. In particular, when two values are different (according to their == operator) but hash to the same value, the unordered_set arranges to hold both of them anyway, usually at slightly reduced efficiency: a hash-based lookup for either value lands in the same bucket, which holds both, and the lookup code has to iterate over that bucket's contents until it finds the one it is looking for.

How to get the elements of a tuple

I am creating a Scrabble game and I need to assign a basic score to the words in the dictionary.
I used make_tuple and stored the result in my tuple. Is there a way to access the elements of a tuple as if they were in a vector?
#include <iostream>
#include <tuple>
#include <string>
#include <fstream>
void parseTextFile()
{
std::ifstream words_file("scrabble_words.txt"); //File containing the words in the dictionary (english) with words that do not exist
std::ofstream new_words_file("test.txt"); //File where only existing words will be saved
std::string word_input;
std::tuple<std::string, int> tupleList;
unsigned int check_integrity;
int counter = 0;
while(words_file >> word_input)
{
check_integrity = 0;
for (unsigned int i = 0; i < word_input.length(); i++)
{
if(word_input[i] >= 'a' && word_input[i] <= 'z') //if the character is a lowercase letter of the alphabet
{
check_integrity++;
}
}
if(word_input.length() == check_integrity)
{
new_words_file << word_input << std::endl; //add the word to the new file
tupleList = std::make_tuple(word_input, getScore(word_input)); //make tuple with the basic score and the word
counter++; //to check if the amount of words in the new file are correct
std::cout << std::get<0>(tupleList) << ": " << std::get<1>(tupleList) << std::endl;
}
}
std::cout << counter << std::endl;
}
One would generally use a tuple when there are more than two values of different types to store. For just two values a pair is a better choice.
In your case, what you want to achieve seems to be a list of word-value pairs. You can store them in a container like a vector, but you can also store them as key-value pairs in a map. A std::map is literally a collection of std::pair objects, and tuples are a generalization of pairs.
For completeness, if my understanding of your code's purpose is correct, these are the additions to your code for storing each tuple in a vector - declarations,
std::tuple<std::string, int> correct_word = {};
std::vector<std::tuple<std::string, int>> existing_words = {};
changes in the loop that saves existing words - here you want to add each word-value tuple to the vector,
if(word_input.length() == check_integrity)
{
// ...
correct_word = std::make_tuple(word_input, getScore(word_input));
existing_words.push_back(correct_word);
// ...
}
...and finally an example of usage outside the construction loop:
for (size_t iv=0; iv<existing_words.size(); ++iv)
{
correct_word = existing_words[iv];
std::cout << std::get<0>(correct_word) << ": " << std::get<1>(correct_word) << std::endl;
}
std::cout << counter << std::endl;
The same code with a map would look like this.
The only declaration would be a map from strings to values (instead of a tuple and a vector of tuples),
std::map<std::string, int> existing_words = {};
In the construction loop you would be creating the map pair in a single line like this,
if(word_input.length() == check_integrity)
{
// ...
existing_words[word_input] = getScore(word_input);
// ...
}
After construction, you access map elements using .first for the word and .second for the score. Below is a printing example that also uses a range-based for loop:
for (const auto& correct_word : existing_words)
std::cout << correct_word.first << ": " << correct_word.second << std::endl;
std::cout << counter << std::endl;
Notice that a std::map is ordered by key by default (lexicographically for std::string keys). You can provide your own ordering rules, or use an unordered map if you don't need any ordering.

Reference to a partial segment of a vector?

I have a black-box C++ function whose source code I don't have access to:
void blackbox(vector<int> &input);
This function modifies the elements of the input vector in an unknown manner.
The problem I have now is that I want to apply the black box function only for a partial segment of a vector, for example,
the last 500 elements of a vector. So, this is the routine that I wrote to attain this goal:
vector<int> foo (1000,5); // 1000 elements, each initialised to 5
vector<int> bar (foo.end()-500,foo.end());
blackbox(bar);
swap_ranges(foo.end()-500,foo.end(),bar.begin());
This code may work, but is there a better way to do this?
It would be good if I can define a vector reference only for a segment of
an existing vector, instead of creating a copy.
I am not so comfortable with the copying and swapping parts in the above code; since this routine is
invoked so frequently, I think the repeated copying and swapping slows down the code.
If I knew the exact operations done by the black box, I would rewrite the function so that it takes vector iterators as the input
arguments. Unfortunately, this is not possible at the moment.
There's no well-defined way to achieve this functionality. With huge caveats and warnings, it can (for one GCC version at least) be hacked as below, or you could perhaps write something with better defined behaviour but based on your compiler's current std::vector implementation....
So... hacked. This will not work if insert/erase/resize/reserve/clear/push_back or any other operation affecting the overall vector is performed. It may not be portable / continue working / work with all optimisation levels / work on Tuesdays / use at own risk etc.. It depends on the empty base class optimisation.
You need a custom allocator but there's a catch: the allocator can't have any state or it'll change the binary layout of the vector object, so we end up with this:
#include <iostream>
#include <vector>
template <typename Container> // easy to get this working...
void f(Container& v)
{
std::cout << "f() v.data() " << v.data() << ", v.size() " << v.size() << '\n';
for (int& n : v) n += 10;
}
void g(std::vector<int>& v) // hard to get this working...
{
std::cout << "g() v.data() " << v.data() << ", v.size() " << v.size() << '\n';
for (int& n : v) n += 100;
}
int* p_; // ouch: can't be a member without changing vector<> memory layout
struct My_alloc : std::allocator<int>
{
// all no-ops except allocate() which returns the constructor argument...
My_alloc(int* p) { p_ = p; }
template <class U, class... Args>
void construct(U* p, Args&&... args) { std::cout << "My_alloc::construct(U* " << p << ")\n"; }
template <class U> void destroy(U* p) { std::cout << "My_alloc::destroy(U* " << p << ")\n"; }
pointer allocate(size_type n, std::allocator<void>::const_pointer hint = 0)
{
std::cout << "My_alloc::allocate() return " << p_ << "\n";
return p_;
}
void deallocate(pointer p, size_type n) { std::cout << "deallocate\n"; }
template <typename U>
struct rebind { typedef My_alloc other; };
};
int main()
{
std::vector<int> v = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
std::cout << "main() v.data() " << v.data() << '\n';
My_alloc my_alloc(&v[3]); // first element to "take over"
std::vector<int, My_alloc> w(3, my_alloc); // num elements to "take over"
f(w);
g(reinterpret_cast<std::vector<int>&>(w));
for (int n : v) std::cout << n << ' ';
std::cout << '\n';
std::cout << "sizeof v " << sizeof v << ", sizeof w " << sizeof w << '\n';
}
Output:
main() v.data() 0x9d76008
My_alloc::allocate() return 0x9d76014
My_alloc::construct(U* 0x9d76014)
My_alloc::construct(U* 0x9d76018)
My_alloc::construct(U* 0x9d7601c)
f() v.data() 0x9d76014, v.size() 3
g() v.data() 0x9d76014, v.size() 3
0 1 2 113 114 115 6 7 8 9
sizeof v 12, sizeof w 12
My_alloc::destroy(U* 0x9d76014)
My_alloc::destroy(U* 0x9d76018)
My_alloc::destroy(U* 0x9d7601c)
deallocate

Why std::make_move_iterator works on vector<string> but not on vector<int>

I was expecting that std::make_move_iterator will always move contents, but it seems not.
It looks like it is moving elements in vector<string> but not in vector<int>.
See the below code snippet:
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
void moveIntVector()
{
std::cout << __func__ << std::endl;
std::vector<int> v1;
for (unsigned i = 0; i < 10; ++i) {
v1.push_back(i);
}
std::vector<int> v2(
std::make_move_iterator(v1.begin() + 5),
std::make_move_iterator(v1.end()));
std::cout << "v1 is: ";
for (auto i : v1) {
std::cout << i << " ";
}
std::cout << std::endl;
std::cout << "v2 is: ";
for (auto i : v2) {
std::cout << i << " ";
}
std::cout << std::endl;
}
void moveStringVector()
{
std::cout << __func__ << std::endl;
std::vector<std::string> v1;
for (unsigned i = 0; i < 10; ++i) {
v1.push_back(std::to_string(i));
}
std::vector<std::string> v2(
std::make_move_iterator(v1.begin() + 5),
std::make_move_iterator(v1.end()));
std::cout << "v1 is: ";
for (auto i : v1) {
std::cout << i << " ";
}
std::cout << std::endl;
std::cout << "v2 is: ";
for (auto i : v2) {
std::cout << i << " ";
}
std::cout << std::endl;
}
int main()
{
moveIntVector();
moveStringVector();
return 0;
}
The result is:
moveIntVector
v1 is: 0 1 2 3 4 5 6 7 8 9 # I expect this should be `0 1 2 3 4` as well!
v2 is: 5 6 7 8 9
moveStringVector
v1 is: 0 1 2 3 4
v2 is: 5 6 7 8 9
I'm on Ubuntu 14.04, gcc 4.8.2 and the code is compiled with -std=c++11
Could you explain why std::make_move_iterator has different behaviour on vector<int> and vector<string>? (Or is it a bug?)
The behaviour is expected. A move from both vectors leaves the original v1 with 5 moved-from elements in their second half.
The difference is that when the strings are moved, what is left behind is empty strings. This is because it is a very efficient way to move strings, and leave the moved-from string in a self-consistent state (Technically, they could be left to hold the value "Hello, World, nice move!", but that would incur extra cost). The bottom line is that you don't see those moved-from strings in your output.
In the case of the int vectors, there is no way to move an int that is more efficient than copying it, so they are just copied over.
If you check the sizes of the vectors, you will see that v1 has size 10 in both cases.
Here's a simplified example to illustrate that the moved-from strings are left empty:
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
int main()
{
std::vector<std::string> v1{"a", "b", "c", "d", "e"};
std::vector<std::string> v2(std::make_move_iterator(v1.begin()),
std::make_move_iterator(v1.end()));
std::cout << "v1 size " << v1.size() << '\n';
std::cout << "v1: ";
for (const auto& s : v1) std::cout << s << " - ";
std::cout << '\n';
std::cout << "v2 size " << v2.size() << '\n';
std::cout << "v2: ";
for (const auto& s : v2) std::cout << s << " - ";
std::cout << '\n';
}
Output:
v1 size 5
v1: - - - - -
v2 size 5
v2: a - b - c - d - e -
When we talk about a move we are not talking about moving the object itself (it remains intact). What gets moved are its internal data. This may or may not affect the value of the object whose internal data gets moved.
That is why your int vector doesn't lose its original ints. As for your string example, v1 still holds the original std::string objects, just like the int example, but their internal values have changed to empty strings.
It is important to remember that internally a std::string (essentially) holds a pointer to a character array. So when you copy a std::string you copy every element of the character array. A move, however, avoids doing all that copying by copying the internal pointer instead.
But if the move operation stopped there that would leave both std::strings pointing at the same character array and changing the character data pointed to by either std::string would also change the other's. So when you move a string it is not enough to merely copy the internal pointer, you have to make the internal pointer of the std::string you moved from point to a new blank character array so that it can no longer affect the string its data was moved to.
When moving an int there is no further action required after the copy of its data. There are no pointers involved so after the copy both ints contain independent data.
A move constructor works like a constructor that takes a reference to the source object together with permission to take over its contents. The default move constructor calls the move constructor of each member variable. With a user-defined one, it is up to the programmer to decide what it does.
You could write your objects so that they are left in an undefined state after being moved from, you can leave them unchanged (the destructor will still be called, so you need to take care of that), or you can keep them valid. A std::string is left in a valid (though unspecified) state after being moved from; in practice it is usually empty.
As for your example:
int is trivially copyable, so moving an int does nothing but copy it.
string is not trivially copyable. It owns dynamic data that the move constructor transfers, and the moved-from string is left with a length of zero. You ARE printing those strings, along with the trailing space you added after each one. They are just the last 5 elements, at the end of what you are printing, so you don't notice them: the result is equivalent to 5 trailing spaces.