I wrote the following code to use std::pair as the key of an unordered_map. However, I don't know why I am getting all 0's as the output of the vector. Can someone please point out where I am going wrong?
struct key_hash
{
size_t operator()(const std::pair<unsigned,unsigned>& key) const
{
return uint64_t((key.first << 32) | key.second);
}
};
typedef std::unordered_map<std::pair<unsigned,unsigned>, std::vector<unsigned>, key_hash> MyMap;
int main()
{
MyMap m;
vector<unsigned> t;
t.push_back(4);
t.push_back(5);
m[make_pair(4294967292,4294967291)]=t;
for(vector<unsigned>::iterator i=m[make_pair(4294967292,4294967291)].begin(),j=m[make_pair(2147483645,2147483643)].end();i!=j;++i)
cout<<"vec="<<(*i)<<"\n";
cout<<"vector empty. \n";
}
i and j are iterators into two different vectors, and they cannot be compared. Using debug iterators might catch this under Visual Studio.
This code: j=m[make_pair(2147483645,2147483643)].end(); will create a new empty vector since the key is different from the previously used one.
When initializing j like this: j=m[make_pair(4294967292,4294967291)].end(); the results are fine:
vec=4
vec=5
vector empty.
You are getting undefined behaviour, since m[make_pair(4294967292,4294967291)] and m[make_pair(2147483645,2147483643)] are probably different objects (unless something very strange with overflow wrapping is happening).
You may have a typo with the literals - just do key = make_pair(...u,...u).
Your hash functor probably has an overflow - unsigned int is probably 32-bit on your system.
Your literals probably exceed the maximum value of signed integers, and are not specified to be unsigned.
Try changing your hash function to something along these lines:
struct key_hash
{
size_t operator()(const std::pair<unsigned,unsigned>& key) const
{
uint64_t tmp = key.first;
tmp = tmp << 32;
return uint64_t(tmp | key.second);
}
};
I have also used a single instance of the pair, so I changed main to:
MyMap m;
vector<unsigned> t;
t.push_back(4);
t.push_back(5);
auto a = make_pair(4294967292,4294967291);
m[a]=t;
for(vector<unsigned>::iterator i=m[a].begin(),j=m[a].end();i!=j;++i)
cout<<"vec="<<(*i)<<"\n";
cout<<"vector empty. \n";
This gave me the correct output:
vec=4
vec=5
vector empty.
Tidying up the loop to make it clear (and ignoring the warning about a 32-bit left-shift on a 32-bit value...)
What you are doing is this:
const auto& first_vector = m[make_pair(4294967292,4294967291)];
const auto& second_vector = m[make_pair(2147483645,2147483643)];
for(auto iter = begin(first_vector) ;
iter != end(second_vector) ; // <<=== SEE THE PROBLEM?
++iter)
{
// ...
}
Incrementing the iterator of one vector will never yield the end() of a different one, so your loop is effectively infinite, until you get a segfault because you've accessed memory that does not belong to you.
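A minimal fix is to take begin() and end() from the same map entry, for example with a range-based for over a single lookup (just a sketch, reusing the question's map m and u-suffixed literals as suggested above):
const auto key = std::make_pair(4294967292u, 4294967291u);
for (unsigned value : m[key]) // begin() and end() now come from the same vector
    std::cout << "vec=" << value << "\n";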
I am working on porting code that was written in MATLAB to C++.
In MATLAB you can slice an Array with another array, like A(B), which results in a new array of the elements of A at the indexes specified by the values of the element in B.
I would like to do a similar thing in C++ using vectors. These vectors are of size 10000-40000 elements of type double.
I want to be able to slice these vectors using another vector of type int containing the indexes to be sliced.
For example, I have a vector v = <1.0, 3.0, 5.0, 2.0, 8.0> and a vector w = <0, 3, 2>. I want to slice v using w such that the outcome of the slice is a new vector (since the old vector must remain unchanged) x = <1.0, 2.0, 5.0>.
I came up with a function to do this:
template<typename T>
std::vector<T> slice(std::vector<T>& v, std::vector<int>& id) {
std::vector<T> tmp;
tmp.reserve(id.size());
for (auto& i : id) {
tmp.emplace_back(v[i]);
}
return tmp;
}
I was wondering if there was potentially a more efficient way to do such a task. Speed is the key here since this slice function will be in a for-loop which has approximately 300000 iterations. I heard the boost library might contain some valid solutions, but I have not had experience yet with it.
I used the chrono library to measure the time it takes to call this slice function, where the vector to be sliced was length 37520 and the vector containing the indexes was size 1550. For a single call of this function, the time elapsed = 0.0004284s. However, over ~300000 for-loop iterations, the total elapsed time was 134s.
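A simplified sketch of that measurement (not the exact code; the ~300000-iteration outer loop is omitted, and v and id stand for the data and index vectors passed to slice above):
#include <chrono>
#include <iostream>
// ...
auto start = std::chrono::steady_clock::now();
std::vector<double> x = slice(v, id);                  // v: ~37520 doubles, id: ~1550 indexes
auto stop = std::chrono::steady_clock::now();
std::chrono::duration<double> elapsed = stop - start;  // seconds, as a double
std::cout << "elapsed = " << elapsed.count() << " s\n";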
Any advice would be much appreciated!
emplace_back has some overhead as it involves some internal accounting inside std::vector. Try this instead:
template<typename T>
std::vector<T> slice(const std::vector<T>& v, const std::vector<int>& id) {
std::vector<T> tmp;
tmp.resize (id.size ());
size_t n = 0;
for (auto i : id) {
tmp [n++] = v [i];
}
return tmp;
}
Also, I removed an unnecessary indirection in your inner loop (the loop variable no longer needs to be a reference).
Edit: I thought about this some more, and inspired by #jack's answer, I think that the inner loop (which is the one that counts) can be optimised further. The idea is to put everything used by the loop in local variables, which gives the compiler the best chance to optimise the code. So try this, see what timings you get. Make sure that you test a Release / optimised build:
template<typename T>
std::vector<T> slice(const std::vector<T>& v, const std::vector<int>& id) {
size_t id_size = id.size ();
std::vector<T> tmp (id_size);
T *tmp_data = tmp.data ();
const int *id_data = id.data ();
const T* v_data = v.data ();
for (size_t i = 0; i < id_size; ++i) {
tmp_data [i] = v_data [id_data [i]];
}
return tmp;
}
The performance seems a bit slow; are you building with compiler optimizations (e.g. g++ main.cpp -O3, or, if using an IDE, switching to Release mode)? This alone sped up computation time around 10x.
If you are already using optimizations, switching to basic for-loop iteration (for (int i = 0; i < id.size(); i++)) sped up computation around 2-3x on my machine; basic for loops have been in C++ forever, and the compiler is likely to have lots of tricks to speed them up.
template<typename T>
std::vector<T> slice(const std::vector<T>& v, const std::vector<int>& id){
// #Jan Schultke's suggestion
std::vector<T> tmp(id.size ());
size_t n = 0;
for (int i = 0; i < id.size(); i++) {
tmp [n++] = v [id [i]];
}
return tmp;
}
My function looks like this:
bool getPair(std::vector<std::vector<unsigned short>>Cards) {
std::sort(Cards.begin(), Cards.end(), Cardsort);
std::map<unsigned short, int>Counter;
for (int i = 0; i < 6; i++)
Counter[Cards[i][0]];
for (const auto& val : Counter) {
if (val.second == 2)
return true;
}
return false;
}
I'm pretty sure I'm using std::map incorrectly, I basically have the vector setup like so:
{{2,0},{3,0},{4,1},{3,0},{4,0},{5,0},{6,0}}
where the first number represents the value and the second represents the card suit. I realize now I should have used an object, which might have made this less complicated, but now I'm trying to use std::map to count how many times each value shows up; if a value shows up two times, the function should return true (in the vector above it would return true because of the 3). I don't think I'm using std::map properly, though.
I want to see if Cards has more than one of the same value in Cards[i][0]; I do not care about duplicates in Cards[i][1].
Tested this and it works. The fix is highlighted:
#include <iostream>
#include <vector>
#include <map>
#include <algorithm>
using namespace std;
bool getPair(std::vector<std::vector<unsigned short>>Cards) {
std::sort(Cards.begin(), Cards.end());
std::map<unsigned short, int>Counter;
for (int i = 0; i < 6; i++)
Counter[Cards[i][0]]++; // ++++++++++++++++++ need to alter the value!
for (const auto& val : Counter) {
if (val.second == 2)
return true;
}
return false;
}
int main() {
// your code goes here
// {{2,0},{3,0},{4,1},{3,0},{4,0},{5,0},{6,0}}
std::vector<std::vector<unsigned short>> c = {{2,0},{3,0},{4,1},{3,0},{4,0},{5,0},{6,0}};
std::cout << getPair(c);
return 0;
}
Here's my suggestion.
Some remarks:
Why use two loops? You already have the map entry at hand, since you want to increase it, so you can check for doubles (aka pairs) right in the counting loop. There is no need for a second pass, and this way it's much cheaper.
I changed the vector parameter to const&. It's a very bad idea to pass such a thing by value; at least I can't see why that would be appropriate in this case.
I left out the sorting, since I can't see what it's needed for; just reinsert it if necessary. Sorting is expensive.
You are right that std:: containers do not need manual initialization; they are properly initialized, and the allocator calls the constructor of new elements, even for e.g. int. That's one reason why int got a default-constructor syntax and you can write funny things like auto a = int().
accessing nonexistent keys of a map simply creates them
using a set and counting will definitely not yield better performance
I think the code is pretty easy to read, here you are:
#include <iostream>
#include <vector>
#include <map>
bool getPair(const std::vector<std::vector<unsigned short>>& cards) {
std::map<unsigned short, int> counts;
for(const auto& n : cards) {
if(++counts[n[0]] == 2)
return true;
}
return false;
}
int main()
{
std::vector<std::vector<unsigned short>> cards1 = {{2,0},{3,0},{4,1},{3,0},{4,0},{5,0},{6,0}};
std::vector<std::vector<unsigned short>> cards2 = {{1,0},{2,0},{4,1},{3,0},{5,0},{7,0},{6,0}};
std::cout << getPair(cards1) << "\n";
std::cout << getPair(cards2) << "\n";
return 0;
}
Edit:
A quote from the C++14 Standard regarding access to non-existent keys of std::map, just for the sake of completeness:
23.4.4.3 map element access [map.access]
T& operator[](const key_type& x);
Effects: If there is no key equivalent to x in the map, inserts value_type(x, T()) into the map.
Requires: key_type shall be CopyInsertable and mapped_type shall be DefaultInsertable into
*this.
Returns: A reference to the mapped_type corresponding to x in *this.
Complexity: Logarithmic.
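In other words, counting with operator[] is safe even for keys that do not exist yet; a tiny illustration (not taken from the question's code):
#include <iostream>
#include <map>
int main() {
    std::map<unsigned short, int> counts;
    ++counts[3];                    // operator[] creates {3, int()} == {3, 0}, then we increment to 1
    ++counts[3];                    // now 2
    std::cout << counts[3] << "\n"; // prints 2
}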
First, you create entries in Counter but then you don't really do anything with them (and why do you loop up to 6 instead of Cards.size()? Your vector has size 7, by the way. Also, why is there some kind of sort there? You don't need it.):
std::map<unsigned short, int>Counter;
for (int i = 0; i < 6; i++)
Counter[Cards[i][0]];
Note that operator[] value-initializes the new mapped value, so the int does start at 0; the real problem is that your loop never increments it, so every count stays at 0. You'll need to rewrite the code along these lines so that it actually counts:
std::map<unsigned short, int> Counter;
for (int i = 0; i < (int)Cards.size(); i++)
{
unsigned short card = Cards[i][0];
auto itr = Counter.find(card);
if(itr == Counter.end())
Counter[card] = 1;
else
itr->second++;
}
I would recommend using std::set for this task:
std::set<unsigned short> Counter;
for (int i = 0; i < (int)Cards.size(); i++)
{
unsigned short card = Cards[i][0];
if(Counter.count(card)>0)
{
return true;
}
Counter.insert(card);
}
return false;
The following structure:
struct SomeStructure {
uint64_t value;
uint64_t data;
};
bool operator > (const SomeStructure& v1, const SomeStructure& v2) {
return (v1.value > v2.value);
}
bool operator < (const SomeStructure& v1, const SomeStructure& v2) {
return (v1.value < v2.value);
}
bool operator == (const SomeStructure& v1, const SomeStructure& v2) {
return (v1.value == v2.value);
}
Is used in a code similar to the following:
SomeStructure st1, st2;
st1.value = st2.value = 10; // has the same 'value'
st1.data = 20; // but is assigned a different number for the 'data'.
st2.data = 40;
std::set<SomeStructure> TheSet;
TheSet.insert(st1);
TheSet.insert(st2);
Will inserting st2 after st1 replace the value of the element present in the set?
In the above example, because the operators > and < are overloaded to only depend on the member SomeStructure::value, both st2 and st1 are considered to be equal while inserting them into TheSet. But the value of SomeStructure::data is different for both these objects. So will it replace the existing element in TheSet or will it ignore the insertion operation if the element is already present?
Is there a way to explicitly enforce either of these two behaviors?
Will this behavior change with compiler and platform?
Edit 1:
I just tested this with g++ (with C++11 enabled). It does not replace. So is there a way to explicitly force it to replace the existing element?
Edit 2:
Actually, there is no standard way to "enforce" this behavior, but it can be done with a simple hack. Though this method is not recommended, let me present it here:
This function is used in place of std::set's insert member function:
template <typename T>
void insert_replace(std::set <T>& theSet, const T& toInsert) {
auto it = theSet.find(toInsert);
if(it != theSet.end())
*((T*)&(*it)) = toInsert;
else
theSet.insert(toInsert);
}
And the above code must be replaced with:
int main() {
SomeStructure st1, st2;
st1.value = st2.value = 10; // has the same 'value'
st1.data = 20; // but is assigned a different number for the 'data'.
st2.data = 40;
std::set<SomeStructure> TheSet;
insert_replace (TheSet, st1);
insert_replace (TheSet, st2);
for(auto ii : TheSet) {
std::cout << ii.data;
}
return (0);
}
This method works fine on my compiler, giving the output 40 instead of 20. But I think people might say this is not a recommended method, because the line *((T*)&(*it)) = toInsert; fools the compiler into thinking that the iterator it isn't const (when it actually is). I believe this is the only way we can force std::set to insert by replacing. Is it fine to use this method in my code, or will it cause problems in the future (even if I document it)?
From the documentation:
the insertion operation checks whether each inserted element is equivalent to an element already in the container, and if so, the element is not inserted
So TheSet.insert(st2); will not insert anything, because st2 is equal to st1, which is already in the set.
If you want to be able to insert both of them, you need to change the comparison functions so they test both value and data, or use std::multiset, which allows duplicate entries.
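If you really want "insert or replace" semantics, a standards-friendly alternative to the cast hack from Edit 2 is to erase and re-insert (just a sketch; insert_or_replace is a name I made up):
template <typename T>
void insert_or_replace(std::set<T>& s, const T& value) {
    auto it = s.find(value);   // finds an element that compares equivalent, if any
    if (it != s.end())
        it = s.erase(it);      // erase returns the iterator following the removed element
    s.insert(it, value);       // insert, using the returned iterator as a hint
}
In C++17 and later you could also use s.extract() to take the node out, modify it and put it back, but erase-plus-insert is enough here.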
int *arr = (int*) malloc(100*sizeof(int));
int *arr_copy = (int*) malloc(100*sizeof(int));
srand(123456789L);
for( int i = 0; i < 100; i++) {
arr[i] = rand();
arr_copy[i] = arr[i];
}
// ------ do stuff with arr ------
// reset arr...
std::copy(arr_copy, arr_copy+100, arr);
while compiling this I get this warning for std::copy():
c:\program files (x86)\microsoft visual studio 10.0\vc\include\xutility(2227):
warning C4996: 'std::_Copy_impl': Function call with parameters that may be
unsafe - this call relies on the caller to check that the passed values are
correct. To disable this warning, use -D_SCL_SECURE_NO_WARNINGS. See
documentation on how to use Visual C++ 'Checked Iterators'
I know how to disable/ignore the warning, but is there a simple one-liner solution to make a "checked iterator" out of an unchecked pointer? Something like this (I know cout is not an unchecked pointer like int*, it's just an example):
ostream_iterator<int> out(cout," ");
std::copy(arr_copy, arr_copy+numElements, out);
I don't want to write a whole new specialized class my_int_arr_output_iterator : iterator.... But can I use one of the existing iterators?
---edit---
As there are many questions about my usage of C-style arrays and malloc instead of STL containers, let me just say that I'm writing a small program to test different sorting algorithms' performance and memory usage. The code snippet you see above is a specialized version specific to this question (the original code is a template class with multiple methods, testing one algorithm for different numbers of elements in arrays of different types).
In other words, I do know how to do this using STL containers (vector) and their iterators (vector::begin/end). What I didn't know is what I asked.
Thanks though, hopefully someone else would benefit from the answers if not me.
The direct answer you're looking for is stdext::checked_array_iterator. This can be used to wrap a pointer and its length into an MSVC checked iterator.
std::copy(arr_copy, arr_copy+100, stdext::checked_array_iterator<int*>(arr, 100) );
They also provide a stdext::checked_iterator which can wrap a non-checked container.
This is a "Mother, may I" warning: the code is correct, but the library writer thinks you're not smart enough to handle it. Turn off stupid warnings.
Here's one:
std::vector<int> arr(100);
std::vector<int> arr_copy(100);
srand(123456789L);
for( int i = 0; i < 100; i++) {
arr[i] = rand();
arr_copy[i] = arr[i];
}
//do stuff
std::copy(arr_copy.begin(), arr_copy.end(), arr.begin());
There is a limited portable solution to this problem.
It can be done with help of boost::filter_iterator adapter.
There are two limitations:
The iterator is bidirectional without random access. it++ and it-- work but it+=10 doesn't.
it=end(); int val = *it; is not checked and will assign garbage to val. This applies only to the element past the last; other iterator values will be checked. To work around this limitation, I would always advance the iterator after using its value, so that after consuming the last value it points to end(). Then it=end()-1; int val1 = *it++; int val2 = *it++; // segfault or failing assert on this line. Either way the error will not go unnoticed.
The solution:
filter_iterator uses a user-defined predicate to control which elements are skipped. We can define a predicate that skips nothing but asserts if the iterator is out of range in debug mode. There is no performance penalty, because in release mode the predicate only returns true and the compiler will optimize it away. Below is the code:
// only header is required
#include "boost/iterator/filter_iterator.hpp"
// ...
const int arr[] = {1, 2, 3, 4, 5};
const int length = sizeof(arr)/sizeof(int);
const int *begin = arr;
const int *end = arr + length;
auto range_check = [begin, end](const int &t)
{
assert(&t >= begin && &t < end );
return true;
};
typedef boost::filter_iterator<decltype(range_check), const int *> CheckedIt;
std::vector<int> buffer;
std::back_insert_iterator<std::vector<int>> target_it(buffer);
std::copy(CheckedIt(range_check, begin, end), CheckedIt(range_check, end, end), target_it);
for(auto c : buffer)
std::cout << c << std::endl;
auto it = CheckedIt(range_check, begin, end);
it--; // assertion fails
auto it_end = CheckedIt(range_check, end-1, end);
it_end ++;
std::cout << *it_end; // garbage out
it_end ++; // assertion fails.
For portability you could use
template <class T>
T* cloneArray(T *a, int length) {
T *b = new T[length];
for (int i = 0; i < length; i++) b[i] = a[i];
return b;
}
You can tweak it to change the behaviour to copy one array to another.
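For example, the copying variant might look like this (a sketch; copyArray is a name I'm introducing, not something from the question):
template <class T>
void copyArray(const T *src, T *dst, int length) {
    // element-by-element copy, mirroring cloneArray above
    for (int i = 0; i < length; i++) dst[i] = src[i];
}
Then copyArray(arr_copy, arr, 100); replaces the std::copy call from the question without triggering the checked-iterator warning.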
I work with a lot of calculation code written in C++ with high performance and low memory overhead in mind. It uses STL containers (mostly std::vector) a lot, and iterates over those containers in almost every single function.
The iterating code looks like this:
for (int i = 0; i < things.size(); ++i)
{
// ...
}
But it produces the signed/unsigned mismatch warning (C4018 in Visual Studio).
Replacing int with some unsigned type is a problem because we frequently use OpenMP pragmas, and it requires the counter to be int.
I'm about to suppress the (hundreds of) warnings, but I'm afraid I've missed some elegant solution to the problem.
On iterators. I think iterators are great when applied in appropriate places. The code I'm working with will never change random-access containers into std::list or something (so iterating with int i is already container agnostic), and will always need the current index. And all the additional code you need to type (iterator itself and the index) just complicates matters and obfuscates the simplicity of the underlying code.
It's all about the type of things.size(). It isn't int, but size_t (which exists in both C and C++), and that is some "usual" unsigned type, e.g. unsigned int on x86_32.
When "less than" (<) is applied to operands of different signedness, the usual arithmetic conversions convert the signed operand to the unsigned type, which can silently change its value (for example, -1 becomes a huge positive number). That is why the compiler treats your signed counter as unsigned and emits the warning.
It would be correct to write it like
for (size_t i = 0; i < things.size(); ++i) { /**/ }
or, even faster, hoisting the size() call out of the loop:
for (size_t i = 0, ilen = things.size(); i < ilen; ++i) { /**/ }
Ideally, I would use a construct like this instead:
for (std::vector<your_type>::const_iterator i = things.begin(); i != things.end(); ++i)
{
// if you ever need the distance, you may call std::distance
// it won't cause any overhead because the compiler will likely optimize the call
size_t distance = std::distance(things.begin(), i);
}
This has the neat advantage that your code suddenly becomes container agnostic.
And regarding your problem, if some library you use requires you to use int where an unsigned int would fit better, its API is messy. Anyway, if you are sure that those values are always positive, you may just do:
int int_distance = static_cast<int>(distance);
This clearly specifies your intent to the compiler, and it won't bug you with warnings anymore.
If you can't/won't use iterators and if you can't/won't use std::size_t for the loop index, make a .size() to int conversion function that documents the assumption and does the conversion explicitly to silence the compiler warning.
#include <cassert>
#include <cstddef>
#include <limits>
// When using int loop indexes, use size_as_int(container) instead of
// container.size() in order to document the inherent assumption that the size
// of the container can be represented by an int.
template <typename ContainerType>
/* constexpr */ int size_as_int(const ContainerType &c) {
const auto size = c.size(); // if no auto, use `typename ContainerType::size_type`
assert(size <= static_cast<std::size_t>(std::numeric_limits<int>::max()));
return static_cast<int>(size);
}
Then you write your loops like this:
for (int i = 0; i < size_as_int(things); ++i) { ... }
The instantiation of this function template will almost certainly be inlined. In debug builds, the assumption will be checked. In release builds, it won't be and the code will be as fast as if you called size() directly. Neither version will produce a compiler warning, and it's only a slight modification to the idiomatic loop.
If you want to catch assumption failures in the release version as well, you can replace the assertion with an if statement that throws something like std::out_of_range("container size exceeds range of int").
Note that this solves both the signed/unsigned comparison as well as the potential sizeof(int) != sizeof(Container::size_type) problem. You can leave all your warnings enabled and use them to catch real bugs in other parts of your code.
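If you go that route, the throwing variant could look like this (a sketch based on the suggestion above; size_as_int_checked is my own name for it):
#include <cstddef>
#include <limits>
#include <stdexcept>
template <typename ContainerType>
int size_as_int_checked(const ContainerType &c) {
    const auto size = c.size();
    if (size > static_cast<std::size_t>(std::numeric_limits<int>::max()))
        throw std::out_of_range("container size exceeds range of int");
    return static_cast<int>(size);
}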
You can use:
size_t type, to remove warning messages
iterators + distance (like the first hint)
only iterators
function object
For example:
#include <iostream>
#include <vector>
#include <algorithm>
// simple class that outputs its value
class ConsoleOutput
{
public:
ConsoleOutput(int value):m_value(value) { }
int Value() const { return m_value; }
private:
int m_value;
};
// functional object
class Predicat
{
public:
void operator()(ConsoleOutput const& item)
{
std::cout << item.Value() << std::endl;
}
};
int main()
{
// fill list
std::vector<ConsoleOutput> list;
list.push_back(ConsoleOutput(1));
list.push_back(ConsoleOutput(8));
// 1) using size_t
for (size_t i = 0; i < list.size(); ++i)
{
std::cout << list.at(i).Value() << std::endl;
}
// 2) iterators + std::distance to recover the index
std::vector<ConsoleOutput>::iterator itDistance = list.begin(), endDistance = list.end();
for ( ; itDistance != endDistance; ++itDistance)
{
// int or size_t
int const position = static_cast<int>(std::distance(list.begin(), itDistance));
std::cout << list.at(position).Value() << std::endl;
}
// 3) iterators
std::vector<ConsoleOutput>::const_iterator it = list.begin(), end = list.end();
for ( ; it != end; ++it)
{
std::cout << (*it).Value() << std::endl;
}
// 4) functional objects
std::for_each(list.begin(), list.end(), Predicat());
}
C++20 now has std::cmp_less
In C++20, we have the standard constexpr functions
std::cmp_equal
std::cmp_not_equal
std::cmp_less
std::cmp_greater
std::cmp_less_equal
std::cmp_greater_equal
added in the <utility> header, exactly for this kind of scenario.
Compare the values of two integers t and u. Unlike builtin comparison operators, negative signed integers always compare less than (and not equal to) unsigned integers: the comparison is safe against lossy integer conversion.
That means, if (for some weird reason) one must use i as an int for the loop and needs to compare it against an unsigned integer, that can be done safely:
#include <utility> // std::cmp_less
for (int i = 0; std::cmp_less(i, things.size()); ++i)
{
// ...
}
This also covers the case where -1 (an int) is mistakenly compared against an unsigned int: the built-in comparison silently converts the -1, so the following will not give you an error:
static_assert(1u < -1);
But the usage of std::cmp_less will
static_assert(std::cmp_less(1u, -1)); // error
I can also propose the following solution for C++11.
for (auto p = 0U; p < sys.size(); p++) {
}
(With auto p = 0, p would be deduced as a plain int, so I have to write p = 0U.)
I will give you a better idea
for(decltype(things.size()) i = 0; i < things.size(); i++){
//...
}
decltype is described as:
Inspects the declared type of an entity or the type and value category of an expression.
So it deduces the type of things.size(), and i will have the same type as things.size(). As a result, i < things.size() is evaluated without any warning.
I had a similar problem. Using size_t was not working for me, so I tried counting down with a signed index instead, which worked (as below):
for(int i = things.size()-1;i>=0;i--)
{
//...
}
I would just do
int pnSize = primeNumber.size();
for (int i = 0; i < pnSize; i++)
cout << primeNumber[i] << ' ';