How to optimize a std::set intersection algorithm (C++) - c++

I'm struggling with a part of my college assignment. I have two subsets of std::set containers containing pointers to quite complex objects, but ordered by different criteria (which is why I can't use std::set_intersection()). I need to find elements that are contained in both subsets as fast as possible. There is a time/complexity requirement on the assignment.
I can do that in n*log(m) time where n is the size of the first subset and m is the size of the second subset by doing the following:
for (auto it = subset1.begin(); it != subset1.end(); it++) {
    if (find(subset2.begin(), subset2.end(), *it) != subset2.end())
        result.insert(*it);
}
This fails the time requirement, which asks for worst-case linear time and better than linear on average.
I found the following question here and I find the hashtable approach interesting. However, I fear that the creation of the hashtable might incur too much overhead. The class contained in the sets looks something like this:
class containedInSets {
    // methods
private:
    vector<string> member1;
    SomeObject member2;
    int member3;
};
I have no control over the SomeObject class, and therefore cannot specify a hash function for it. I'd have to hash the pointer. Furthermore, the vector may grow quite large (into the thousands of entries).
What is the quickest way of doing this?

Your code is not O(n log(m)) but O(n * m).
std::find(subset2.begin(), subset2.end(), *it) is linear, but std::set has methods find and count which are in O(log(n)) (they do a binary search).
So you can simply do:
for (const auto& e : subset1) {
    if (subset2.count(e) != 0) {
        result.insert(e);
    }
}
This has complexity n*log(m) instead of your n * m.
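If the average-better-than-linear requirement rules out the extra log factor, the pointer-hashing idea mentioned in the question could look roughly like this (a sketch only, assuming subset1, subset2 and result hold containedInSets* and that hashing the raw pointers is acceptable):

#include <unordered_set>

// Build a hash set of the second subset's pointers (average O(m)), then probe it
// while walking the first subset (average O(1) per lookup, so O(n + m) overall).
std::unordered_set<containedInSets*> lookup(subset2.begin(), subset2.end());
for (containedInSets* p : subset1) {
    if (lookup.count(p) != 0) {
        result.insert(p);
    }
}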

Confusion about time complexity with hash maps

On leetcode I find it is common to "ignore" the worst-case time complexity involving hash maps. I thought in software interviews it was standard to assume the worst case, as they often do. Below is my solution to a simple problem: find the first non-repeating char in a string. I understand that hash maps are on average O(1) per lookup, but when iterating over the string and looking up the hash map, why is the time complexity not O(N^2) and instead O(N)?
#include <string>
#include <unordered_map>
using namespace std;

class Solution {
public:
    unordered_map<char, int> m;

    int firstUniqChar(string s) {
        for (char c : s) {
            m[c]++;
        }
        for (int i = 0; i < s.length(); i++) {
            if (m[s[i]] == 1) {
                return i;
            }
        }
        return -1;
    }
};
It is on average O(N) because a hash map is on average O(1) per lookup and you do O(N) of them.
On average means averaging over all possible inputs. That means there might exist an input array that breaks a particular hash and achieves O(N), or much worse, on every lookup.
Worst-case is heavily implementation specific - e.g. hashing into buckets depends on how elements are stored in each bucket. If they are in a simple list, then lookup is O(<duplicates>); a binary tree will bring that down to O(log <duplicates>). There might also be a difference between searching for keys that are present and keys that are missing.
There is also the big assumption that the hashed container can grow with the number of elements stored, i.e. keeping the occupancy of the buckets low.
It does not hurt to mention their worst cases in interviews; it demonstrates that you know they have limits.
The time complexity of the given problem is O(N). You may provide a perfect hash function for it, that is, one where no collision ever happens. A perfect hash function here is static_cast<size_t>(256 + c). And if you look at the fastest solutions to this problem on leetcode, you will see that people simply use plain arrays.
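For illustration, a plain-array version of that counting idea (a sketch; not taken from any particular leetcode submission):

#include <string>
using namespace std;

// Count occurrences in a plain 256-slot array; indexing by unsigned char
// avoids negative indices for characters above 127.
int firstUniqChar(const string& s) {
    int counts[256] = {0};
    for (unsigned char c : s) ++counts[c];
    for (int i = 0; i < (int)s.length(); ++i)
        if (counts[(unsigned char)s[i]] == 1) return i;
    return -1;
}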

Which container is most efficient for multiple insertions / deletions in C++?

I was set a homework challenge as part of an application process (I was rejected, by the way; I wouldn't be writing this otherwise) in which I was to implement the following functions:
// Store a collection of integers
class IntegerCollection {
public:
    // Insert one entry with value x
    void Insert(int x);
    // Erase one entry with value x, if one exists
    void Erase(int x);
    // Erase all entries, x, from <= x < to
    void Erase(int from, int to);
    // Return the count of all entries, x, from <= x < to
    size_t Count(int from, int to) const;
The functions were then put through a bunch of tests, most of which were trivial. The final test was the real challenge as it performed 500,000 single insertions, 500,000 calls to count and 500,000 single deletions.
The member variables of IntegerCollection were not specified and so I had to choose how to store the integers. Naturally, an STL container seemed like a good idea and keeping it sorted seemed an easy way to keep things efficient.
Here is my code for the four functions using a vector:
    // Previous bit of code shown goes here
private:
    std::vector<int> integerCollection;
};
void IntegerCollection::Insert(int x) {
    /* using lower_bound to find the right place for x to be inserted
       keeps the vector sorted and makes life much easier */
    auto it = std::lower_bound(integerCollection.begin(), integerCollection.end(), x);
    integerCollection.insert(it, x);
}

void IntegerCollection::Erase(int x) {
    // find the location of the first element containing x and delete it if it exists
    auto it = std::find(integerCollection.begin(), integerCollection.end(), x);
    if (it != integerCollection.end()) {
        integerCollection.erase(it);
    }
}

void IntegerCollection::Erase(int from, int to) {
    if (integerCollection.empty()) return;
    // lower_bound points to the first element of integerCollection >= from/to
    auto fromBound = std::lower_bound(integerCollection.begin(), integerCollection.end(), from);
    auto toBound = std::lower_bound(integerCollection.begin(), integerCollection.end(), to);
    /* std::vector::erase deletes entries between the two iterators
       fromBound (included) and toBound (not included) */
    integerCollection.erase(fromBound, toBound);
}

size_t IntegerCollection::Count(int from, int to) const {
    if (integerCollection.empty()) return 0;
    int count = 0;
    // lower_bound points to the first element of integerCollection >= from/to
    auto fromBound = std::lower_bound(integerCollection.begin(), integerCollection.end(), from);
    auto toBound = std::lower_bound(integerCollection.begin(), integerCollection.end(), to);
    // increment the iterator until fromBound == toBound (we don't count elements of value = to)
    while (fromBound != toBound) {
        ++count;
        ++fromBound;
    }
    return count;
}
The company got back to me saying that they wouldn't be moving forward because my choice of container meant the runtime complexity was too high. I also tried using list and deque and compared the runtime. As I expected, I found that list was dreadful and that vector took the edge over deque. So as far as I was concerned I had made the best of a bad situation, but apparently not!
I would like to know: what is the correct container to use in this situation? deque only makes sense if I can guarantee insertion or deletion at the ends of the container, and list hogs memory. Is there something else that I'm completely overlooking?
We cannot know what would make the company happy. If they reject std::vector without giving a clear reason, I wouldn't want to work for them anyway. Moreover, we don't really know the precise requirements. Were you asked to provide one reasonably well-performing implementation? Did they expect you to squeeze out the last percent of the provided benchmark by profiling a bunch of different implementations?
The latter is probably too much for a homework challenge as part of an application process. If it is the former, you can either:
roll your own. It is unlikely that the interface you were given can be implemented more efficiently than one of the std containers does... unless your requirements are so specific that you can write something that performs well under that specific benchmark.
std::vector for data locality. See eg here for Bjarne himself advocating std::vector rather than linked lists.
std::set for ease of implementation. It seems like you want the container sorted and the interface you have to implement fits that of std::set quite well.
Let's compare only insertion and erasure, assuming the container needs to stay sorted:

operation    std::set    std::vector
insert       log(N)      N
erase        log(N)      N
Note that the log(N) for the binary_search to find the position to insert/erase in the vector can be neglected compared to the N.
Now you have to consider that the asymptotic complexity listed above completely neglects the non-linearity of memory access. In reality data can be far away in memory (std::set) leading to many cache misses or it can be local as with std::vector. The log(N) only wins for huge N. To get an idea of the difference 500000/log(500000) is roughly 26410 while 1000/log(1000) is only ~100.
I would expect std::vector to outperform std::set for reasonably small container sizes, but at some point the log(N) wins over the cache. The exact location of this turning point depends on many factors and can only reliably be determined by profiling and measuring.
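As a rough illustration of the std::set suggestion above, here is a sketch using std::multiset (since the interface allows duplicate values). Note that while Insert and the single-value Erase are O(log N), Count still walks the counted range because multiset iterators are not random access:

#include <cstddef>
#include <iterator>
#include <set>

class IntegerCollection {
public:
    void Insert(int x) { values.insert(x); }                   // O(log N)

    void Erase(int x) {                                        // O(log N)
        auto it = values.find(x);
        if (it != values.end()) values.erase(it);              // erase one entry only
    }

    void Erase(int from, int to) {                             // O(log N + number erased)
        values.erase(values.lower_bound(from), values.lower_bound(to));
    }

    std::size_t Count(int from, int to) const {                // O(log N) to find the bounds,
        return std::distance(values.lower_bound(from),         // but walking between them
                             values.lower_bound(to));          // is linear in the range size
    }

private:
    std::multiset<int> values;
};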
Nobody knows which container is most efficient for multiple insertions / deletions. That is like asking for the most fuel-efficient car engine design possible. People are always innovating on car engines; they make more efficient ones all the time. However, I would recommend a splay tree. The time required for an insertion or deletion in a splay tree is not constant: some insertions take a long time and some take only a very short time. However, the average time per insertion/deletion is always guaranteed to be O(log n), where n is the number of items being stored in the splay tree. Logarithmic time is extremely efficient; it should be good enough for your purposes.
The first thing that comes to mind is to hash the integer value so single look ups can be done in constant time.
The integer value can be hashed to compute an index in to an array of bools or bits, used to tell if the integer value is in the container or not.
Counting and deleting large ranges could be sped up from there, by using multiple hash tables for specific integer ranges.
If you had 0x10000 hash tables, each storing ints from 0 to 0xFFFF, and you were using 32-bit integers, you could mask and shift the upper half of the int value and use that as an index to find the correct hash table to insert / delete values from.
IntHashTable containers[0x10000];

u_int32 hashIndex = (u_int32)value / 0x10000;
u_int32 valueInTable = (u_int32)value - (hashIndex * 0x10000);
containers[hashIndex].insert(valueInTable);
Count, for example, could be implemented like so, if each hash table kept a count of the number of elements it contained:
int indexStart = startRange / 0x10000;
int indexEnd = endRange / 0x10000;
int countTotal = 0;
for (int i = indexStart; i <= indexEnd; ++i) {
    countTotal += containers[i].count();
}
Not sure if sorting really is a requirement for removing the range; it might be based on position. Anyway, here is a link with some hints about which STL container to use:
In which scenario do I use a particular STL container?
Just FYI.
Vector may be a good choice, but it does a lot of reallocation, as you know. I prefer deque instead, as it doesn't require a big chunk of memory to allocate all items. For such a requirement as yours, list probably fits better.
A basic solution for this problem might be std::map<int, int>,
where the key is the integer you are storing and the value is the number of occurrences.
The problem with this is that you cannot quickly remove/count ranges; in other words, the complexity is linear.
For a quick count you would need to implement your own complete binary tree, where you can know the number of nodes between two nodes (the upper- and lower-bound nodes) because you know the size of the tree and how many left and right turns you took to reach the upper- and lower-bound nodes. Note that we are talking about a complete binary tree; in a general binary tree you cannot make this calculation fast.
For a quick range remove I do not know how to make it faster than linear.
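For the fast range count specifically, one ready-made alternative to hand-rolling a complete binary tree is GCC's policy-based tree with order statistics. A minimal sketch (GCC-only and non-portable, and as written it stores each value at most once, unlike the original interface, which permits duplicates):

#include <cstddef>
#include <ext/pb_ds/assoc_container.hpp>   // GCC-only policy-based data structures
#include <ext/pb_ds/tree_policy.hpp>
#include <functional>

using ordered_set = __gnu_pbds::tree<int, __gnu_pbds::null_type, std::less<int>,
                                     __gnu_pbds::rb_tree_tag,
                                     __gnu_pbds::tree_order_statistics_node_update>;

// order_of_key(x) returns the number of stored keys strictly less than x in O(log N),
// so a half-open range count [from, to) is just the difference of two calls.
std::size_t count_range(const ordered_set& s, int from, int to) {
    return s.order_of_key(to) - s.order_of_key(from);
}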

STL containers and large amounts of data

I have a large collection of data that is read into memory - temporarily, but necessary for the system.
I have been checking the performance of std::vector as well as std::unordered_map.
For std::vector I used a struct of type:
struct information {
    std::string name;
    unsigned int offset;
};
For std::unordered_map I used the std::string for the key and the unsigned int offset for the value.
If, let's say, 2 000 000 of these are loaded into memory, I tried the following and got these results:
std::vector:
With random strings, never really larger than 32 characters, and with reserve called on the vector:
std::vector<information> vec;
vec.reserve(2500000);
The insertion
vec.push_back({dataName, offset});
is quite fast. Trying to find data is very slow though. The find was implemented like this:
auto it = std::find_if(vec.begin(), vec.end(), [&name](const information &info) -> bool { return info.name == name; });
Which makes sense, seeing that it is a large vector and the correct struct is found by a name compare. But the performance was extremely poor. The memory used was fine - I assume part of the memory growth was due to std::string resizing.
My question on the vector implementation is: is there a way to improve the look-up time? I know that a vector can be sorted to improve look-up time, but then you lose time sorting the vector, especially on a vector of this size.
std::unordered_map:
The insertion
std::unordered_map<std::string, unsigned int> unordMap;
unordMap.reserve(2500000);
unordMap.emplace(name, offset);
takes a very long time. When reserving space beforehand in an attempt to shorten the insertion time the following happens:
The memory at the end of insertion is a lot more when not calling reserve; without reserve the memory is still a lot more than with the vector implementation. The reserve doesn't really improve insertion time.
Of course the look-up is very fast. My question about the std::unordered_map is: can the insertion time and memory usage be improved?
If neither of these can be done, then my next question will probably be quite obvious: is there a way to get a result in between these two data structures? What is best for large amounts of data?
#include <cstddef>
#include <functional>
#include <string>
#include <tuple>

struct information {
    std::string name;
    unsigned int offset;

    information(information const&) = default;
    information(information&&) = default;
    information(std::string n, unsigned o)
        : name(std::move(n)), offset(o), hash(std::hash<std::string>()(name)) {}
    information() : information("", 0) {}

    bool operator<(information const& o) const {
        return tie() < o.tie();
    }

    std::tuple<std::size_t, std::string const&> tie() const { return std::tie(hash, name); }

private:
    std::size_t hash;
};
Use the above structure for your std::vector.
After adding all the data, std::sort it.
To find something matching name do this:
#include <algorithm>

struct information_searcher {
    struct helper {
        std::tuple<std::size_t, std::string const&> data;

        helper(std::string const& o) : data(std::hash<std::string>()(o), o) {}
        helper(helper const& o) = default;
        helper(information const& o) : data(o.tie()) {}

        bool operator<(helper const& o) const { return data < o.data; }
    };

    bool operator()(helper lhs, helper rhs) const { return lhs < rhs; }
};

// vec is the sorted std::vector<information> described above
information* get_info_by_name(std::string const& name) {
    auto range = std::equal_range(vec.begin(), vec.end(),
                                  information_searcher::helper(name), information_searcher{});
    if (range.first == range.second) {
        return nullptr;
    } else {
        return &*range.first;
    }
}
which is a nearly zero-overhead lookup.
What we do here is hash the strings (for fast comparison), falling back on std::string comparison if we have a collision.
information_searcher is a class that lets us search the data without having to create an information (which would require a wasteful allocation).
get_info_by_name returns a pointer -- nullptr if not found, and a pointer to the first element with that name otherwise.
Changing information.name is impolite, and makes the hash field incorrect.
This may use moderately more memory than the naive std::vector version.
In general, if your work consists of 'add a bunch of stuff to a table' then 'do a bunch of lookups', your best bet is to build a std::vector, sort it in a fast way, then use equal_range to do the lookups. map and unordered_map are optimized for lots of mixed inserts/deletes/etc.
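A hypothetical usage sketch of the approach above; vec is assumed to be the std::vector<information> that get_info_by_name binary-searches, and the name and offset values are made up:

#include <algorithm>
#include <vector>

std::vector<information> vec;   // the container get_info_by_name searches

int main() {
    vec.reserve(2500000);
    vec.emplace_back("some name", 42u);     // ... one emplace_back per record ...
    std::sort(vec.begin(), vec.end());      // orders by (hash, name) via information::operator<

    if (information* hit = get_info_by_name("some name")) {
        unsigned int offset = hit->offset;  // found: use the stored offset
        (void)offset;
    }
}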
vector is usually implemented as a 'dynamic array' and should be the most memory-efficient.
With a good reservation strategy it can have amortized O(1) insertion at the end = fast. Searching is O(n) = very bad.
You can help the vector by sorting it (and if you first load everything and only then search, I think that would be best - std::sort + std::binary_search).
You can as well implement something like insertion sort using std::lower_bound. Finding the position is O(log n) = good (the vector insert itself still has to shift elements), search = O(log n) = good.
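A minimal sketch of that sorted-vector idea, using the question's plain struct and std::lower_bound (the function names here are assumptions, not code from the question):

#include <algorithm>
#include <string>
#include <vector>

struct information {            // the question's plain struct
    std::string name;
    unsigned int offset;
};

static bool by_name(const information& a, const information& b) { return a.name < b.name; }

// O(log n) to find the position; the insert itself still shifts elements (O(n)).
void insert_sorted(std::vector<information>& v, information info) {
    auto pos = std::lower_bound(v.begin(), v.end(), info, by_name);
    v.insert(pos, std::move(info));
}

// O(log n) lookup via binary search on the name-sorted vector.
const information* find_by_name(const std::vector<information>& v, const std::string& name) {
    auto pos = std::lower_bound(v.begin(), v.end(), information{name, 0}, by_name);
    return (pos != v.end() && pos->name == name) ? &*pos : nullptr;
}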
map (ordered) can actually do the same work, but it is typically implemented as a tree = less memory efficient, with access as good as a sorted vector (maybe with less reallocation, but in your case the sorted vector is still best).
unordered_map is usually implemented using hash tables = some memory overhead but fast operations (insertion cannot be as fast as in an unsorted vector, but should still be pretty fast). The problem with hashing is that it can be fast, even the fastest, but may as well be the worst solution (in extreme conditions). The above structures (sorted vector and map/tree) are stable and always behave the same - logarithmic complexity.
The problem with a large vector is the lookup time when you don't know the index of objects you want. One way to improve it is, as you stated, to keep an ordered vector and do binary search on it. That way, lookup time will not be of linear complexity but rather of logarithmic complexity, which saves up quite a lot of time with very large containers. This is the lookup used in std::map (the ordered one). You can do a similar binary search using std::lower_bound or std::equal_range on your std::vector.
The problem with a large unordered map is completely different: this kind of container uses a hash function and a modulus calculation in order to place the elements according to their keys in a plain array. So when you have n elements in a std::unordered_map, it is very unlikely that you only need an n-element-long array, because some indices will not be filled. You will use at least the greatest index produced by the hash-and-modulo. One way to improve memory usage as well as insertion time is to write your own hash function. But it might be hard, depending on what kind of strings you are using.
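If you do experiment with your own hash, a minimal sketch might look like this (a djb2-style string hash; NameHash is an assumed name, and whether it actually beats the default std::hash<std::string> is something you would have to measure):

#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical djb2-style hash for the name keys.
struct NameHash {
    std::size_t operator()(const std::string& s) const noexcept {
        std::size_t h = 5381;                       // djb2 starting value
        for (unsigned char c : s) h = h * 33 + c;   // classic h = h * 33 + c step
        return h;
    }
};

std::unordered_map<std::string, unsigned int, NameHash> offsets;  // name -> offset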
Well, the optimal solution here would be to create a std::map, which is logarithmic in complexity for both insertion and lookup. Although, I don't see any reason why you wouldn't use std::vector. It is pretty fast to quicksort even 2M elements, especially if you only do it once, and std::binary_search is then really fast. Think over whether you need to make a lot of lookups between insertions.

count the number of distinct absolute values among the elements of the array

I was asked an interview question to find the number of distinct absolute values among the elements of the array. I came up with the following solution (in C++) but the interviewer was not happy with the code's run time efficiency.
I will appreciate pointers as to how I can improve the run time efficiency of this code?
Also, how do I calculate the efficiency of the code below? The for loop executes A.size() times. However, I am not sure about the efficiency of STL std::find (in the worst case it could be O(n), so does that make this code O(n²)?).
Code is:
int countAbsoluteDistinct(const std::vector<int> &A) {
    using namespace std;
    list<int> x;
    vector<int>::const_iterator it;
    for (it = A.begin(); it < A.end(); it++)
        if (find(x.begin(), x.end(), abs(*it)) == x.end())
            x.push_back(abs(*it));
    return x.size();
}
To propose an alternative to the set-based code:
Note that since we don't want to alter the caller's vector, we take it by value. It's better to let the compiler copy for us than to make our own copy. If it's OK to destroy their values, we can take it by non-const reference.
#include <vector>
#include <algorithm>
#include <iterator>
#include <cstdlib>
using namespace std;

int count_distinct_abs(vector<int> v)
{
    // O(n) where n = distance(v.begin(), v.end());
    // a lambda avoids ambiguity between the abs overloads
    transform(v.begin(), v.end(), v.begin(), [](int x) { return abs(x); });

    // Average case O(n log n), worst case O(n^2) for a plain quicksort
    // (since C++11, std::sort is required to be O(n log n) even in the worst case).
    // To guarantee worst case O(n log n) you could also use make_heap, then sort_heap.
    sort(v.begin(), v.end());

    // unique takes a sorted range and moves things around to get duplicated
    // items to the back; it returns an iterator to the end of the unique section of the range
    auto unique_end = unique(v.begin(), v.end()); // Again n comparisons

    return distance(v.begin(), unique_end); // Constant time for random access iterators (like vector's)
}
The advantage here is that we only allocate/copy once if we decide to take by value, and the rest is all done in-place while still giving you an average complexity of O(n log n) on the size of v.
std::find() is linear (O(n)). I'd use a sorted associative container to handle this, specifically std::set.
#include <vector>
#include <set>
#include <cstdlib>
using namespace std;

int distinct_abs(const vector<int>& v)
{
    std::set<int> distinct_container;

    for (auto curr_int = v.begin(), end = v.end(); // no need to call v.end() multiple times
         curr_int != end;
         ++curr_int)
    {
        // std::set only allows single entries
        // since that is what we want, we don't care that this fails
        // if the second (or more) of the same value is attempted to
        // be inserted.
        distinct_container.insert(abs(*curr_int));
    }

    return distinct_container.size();
}
There is still some runtime penalty with this approach. Using a separate container incurs the cost of dynamic allocations as the container size increases. You could do this in place and not incur this penalty; however, with code at this level it's sometimes better to be clear and explicit and let the optimizer (in the compiler) do its work.
Yes, this will be O(N²) -- you'll end up with a linear search for each element.
A couple of reasonably obvious alternatives would be to use an std::set or std::unordered_set. If you don't have C++0x, you can replace std::unordered_set with tr1::unordered_set or boost::unordered_set.
Each insertion in an std::set is O(log N), so your overall complexity is O(N log N).
With unordered_set, each insertion has constant (expected) complexity, giving linear complexity overall.
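A minimal sketch of the unordered_set variant described above:

#include <cstdlib>
#include <unordered_set>
#include <vector>

int count_distinct_abs_hashed(const std::vector<int>& v)
{
    std::unordered_set<int> seen;
    for (int x : v)
        seen.insert(std::abs(x));              // duplicates are ignored; expected O(1) per insert
    return static_cast<int>(seen.size());      // expected O(n) overall
}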
Basically, replace your std::list with a std::set. This gives you O(log(set.size())) searches + O(1) insertions, if you do things properly. Also, for efficiency, it makes sense to cache the result of abs(*it), although this will have only a minimal (negligible) effect. The efficiency of this method is about as good as you can get it, without using a really nice hash (std::set uses binary trees) or more information about the values in the vector.
Since I was not happy with the previous answers, here is mine today. Your initial question does not mention how big your vector is. Suppose your std::vector<> is extremely large and has very few duplicates (why not?). This means that using another container (e.g. std::set<>) will basically duplicate your memory consumption. Why would you do that, since your goal is simply to count non-duplicates?
I like @Flame's answer, but I was not really happy with the call to std::unique. You've spent lots of time carefully sorting your vector, and then you simply discard the sorted array while you could be re-using it afterwards.
I could not find anything really elegant in the standard library, so here is my proposal (a mixture of std::transform + std::abs + std::sort, but without touching the sorted array afterwards).
#include <iterator>

// count the number of distinct absolute values among the elements of the sorted container
template<class ForwardIt>
typename std::iterator_traits<ForwardIt>::difference_type
count_unique(ForwardIt first, ForwardIt last)
{
    if (first == last)
        return 0;

    typename std::iterator_traits<ForwardIt>::difference_type count = 1;
    ForwardIt previous = first;
    while (++first != last) {
        if (!(*previous == *first)) ++count;
        ++previous;
    }
    return count;
}
A bonus point is that it works with forward iterators:
#include <iostream>
#include <list>

int main()
{
    std::list<int> nums {1, 3, 3, 3, 5, 5, 7, 8};
    std::cout << count_unique(std::begin(nums), std::end(nums)) << std::endl;

    const int array[] = {0, 0, 0, 1, 2, 3, 3, 3, 4, 4, 4, 4};
    const int n = sizeof array / sizeof *array;
    std::cout << count_unique(array, array + n) << std::endl;
    return 0;
}
Two points.
std::list is very bad for search. Each search is O(n).
Use std::set. Insert is logarithmic, it removes duplicates, and it keeps the values sorted. Inserting every value takes O(n log n); then use set::size to find how many distinct values there are.
EDIT:
To answer part 2 of your question, the C++ standard mandates worst-case complexity for operations on containers and algorithms.
Find: since you are using the free-function version of find, which takes iterators, it cannot assume anything about the passed-in sequence; it cannot assume the range is sorted, so it must traverse every item until it finds a match, which is O(n).
If you are using set::find on the other hand, this member find can utilize the structure of the set, and its performance is required to be O(log N), where N is the size of the set.
To answer your second question first, yes the code is O(n^2) because the complexity of find is O(n).
You have options to improve it. If the range of numbers is small, you can just set up a large enough array and increment counts while iterating over the source data. If the range is larger but sparse, you can use a hash table of some sort to do the counting. Both of these options have linear complexity.
Otherwise, I would do one pass to take the abs value of each item, then sort them, and then you can do the aggregation in a single additional pass. The complexity here is n log(n) for the sort. The other passes don't matter for complexity.
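For the small-range case, a sketch of the counting-array idea (assuming every value is known to satisfy |x| <= limit):

#include <cstdlib>
#include <vector>

// Counts distinct absolute values when all inputs are known to satisfy |x| <= limit.
int count_distinct_abs_small_range(const std::vector<int>& v, int limit)
{
    std::vector<char> seen(limit + 1, 0);   // seen[a] == 1 once |x| == a has appeared
    int distinct = 0;
    for (int x : v) {
        int a = std::abs(x);
        if (!seen[a]) { seen[a] = 1; ++distinct; }
    }
    return distinct;
}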
I think a std::map could also be interesting:
int absoluteDistinct(const vector<int> &A)
{
    map<int, char> my_map;
    for (vector<int>::const_iterator it = A.begin(); it != A.end(); it++)
    {
        my_map[abs(*it)] = 0;
    }
    return my_map.size();
}
As #Jerry said, to improve a little on the theme of most of the other answers, instead of using a std::map or std::set you could use a std::unordered_map or std::unordered_set (or the boost equivalent).
This would reduce the runtime from O(n lg n) down to O(n).
Another possibility, depending on the range of the data given, you might be able to do a variant of a radix sort, though there's nothing in the question that immediately suggests this.
Sort the list with a Radix style sort for O(n)ish efficiency. Compare adjacent values.
The best way is to customize the quicksort algorithm so that, while partitioning, whenever we find two equal elements we overwrite the second duplicate with the last element in the range and then reduce the range. This ensures you will not process duplicate elements twice. Also, once the quicksort is done, the remaining range of elements is the answer.
Complexity is still O(n log n), BUT this should save at least two passes over the array.
Also, the savings are proportional to the percentage of duplicates. Imagine if they twist the original question by saying, 'say 90% of the elements are duplicates' ...
One more approach:
Space efficient: use a map (tree-based) - O(log N) per insert, O(n log N) overall - and just keep the count of the number of elements successfully inserted.
Time efficient: use a hash table - O(n) for all inserts - and just keep the count of the number of elements successfully inserted.
You have nested loops in your code. If you scan the whole array for each element, it gives you O(n²) time complexity, which is not acceptable in most scenarios. That is the reason the merge sort and quicksort algorithms came about: to save processing cycles and machine effort. I suggest you go through the suggested links and redesign your program.

Difference between two vector<MyType*> A and B

I've got two vector<MyType*> objects called A and B. The MyType class has a field ID and I want to get the MyType* which are in A but not in B. I'm working on an image analysis application and I was hoping to find a fast/optimized solution.
The unordered approach will typically have quadratic complexity unless the data is sorted beforehand (by your ID field), in which case it would be linear and would not require repeated searches through B.
struct CompareId
{
    bool operator()(const MyType* a, const MyType* b) const
    {
        return a->ID < b->ID;
    }
};
...
sort(A.begin(), A.end(), CompareId());
sort(B.begin(), B.end(), CompareId());

vector<MyType*> C;
// pass the same comparator to set_difference so it uses the ID ordering, not raw pointer order
set_difference(A.begin(), A.end(), B.begin(), B.end(), back_inserter(C), CompareId());
Another solution is to use an ordered container like std::set with CompareId used for the StrictWeakOrdering template argument. I think this would be better if you need to apply a lot of set operations. That has its own overhead (being a tree) but if you really find that to be an efficiency problem, you could implement a fast memory allocator to insert and remove elements super fast (note: only do this if you profile and determine this to be a bottleneck).
Warning: getting into somewhat complicated territory.
There is another solution you can consider which could be very fast if applicable, and with it you never have to worry about sorting data. Basically, make any group of MyType objects which share the same ID store a shared counter (e.g. a pointer to an unsigned int).
This will require creating a map of IDs to counters and fetching the counter from the map each time a MyType object is created based on its ID. Since you have MyType objects with duplicate IDs, you shouldn't have to insert into the map as often as you create MyType objects (most can probably just fetch an existing counter).
In addition to this, have a global 'traversal' counter which gets incremented whenever it's fetched.
static unsigned int counter = 0;

unsigned int traversal_counter()
{
    // make this atomic for multithreaded applications; it also needs
    // to be modified to set all existing ID-associated counters to 0
    // on overflow (see below)
    return ++counter;
}
Now let's go back to where you have A and B vectors storing MyType*. To fetch the elements in A that are not in B, we first call traversal_counter(). Assuming it's the first time we call it, that will give us a traversal value of 1.
Now iterate through every MyType* object in B and set the shared counter for each object to the traversal value, 1.
Now iterate through every MyType* object in A. The ones that have a counter value which doesn't match the current traversal value(1) are the elements in A that are not contained in B.
What happens when you overflow the traversal counter? In this case, we iterate through all the counters stored in the ID map and set them back to zero along with the traversal counter itself. This will only need to occur once in about 4 billion traversals if it's a 32-bit unsigned int.
This is about the fastest solution you can apply to your given problem. It can do any set operation in linear complexity on unsorted data (and always, not just in best-case scenarios like a hash table), but it does introduce some complexity so only consider it if you really need it.
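A rough sketch of the shared-counter idea described above (type and member names are assumptions; the ID-to-counter map that wires up each object's counter pointer, and the overflow handling, are omitted):

#include <vector>

struct MyType {
    int ID;
    unsigned int* counter = nullptr;   // shared by all MyType objects with the same ID
};

static unsigned int traversal = 0;

// Elements of A whose shared counter was not stamped while walking B.
std::vector<MyType*> in_a_not_in_b(const std::vector<MyType*>& A,
                                   const std::vector<MyType*>& B)
{
    ++traversal;                                   // new traversal value
    for (MyType* b : B) *b->counter = traversal;   // stamp every ID present in B

    std::vector<MyType*> C;
    for (MyType* a : A)
        if (*a->counter != traversal)              // not stamped => ID not present in B
            C.push_back(a);
    return C;
}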
Sort both vectors (std::sort) according to ID and then use std::set_difference. You will need to define a custom comparator to pass to both of these algorithms, for example
struct comp
{
    bool operator()(MyType* lhs, MyType* rhs) const
    {
        return lhs->ID < rhs->ID;
    }
};
First look at the problem. You want "everything in A not in B". That means you're going to have to visit "everything in A". You'll also have to visit everything in B to have knowledge of what is and is not in B. So that suggests there should be an O(n) + O(m) solution, or taking liberty to elide the difference between n and m, O(2n).
Let's consider the std::set_difference approach. Each sort is O(n log n), and set_difference is O(n). So the sort-sort-set_difference approach is O(n + 2n log n). Let's call that O(4n).
Another approach would be to first place the elements of B in a set (or map). Iteration across B to create the set is O(n) plus insertion O(log n) of each element, followed by iteration across A O(n), with a lookup for each element of A (log n), gives a total: O(2n log n). Let's call that O(3n), which is slightly better.
Finally, using an unordered_set (or unordered_map), and assuming we get average case of O(1) insertion and O(1) lookup, we have an approach that is O(2n). A-ha!
The real win here is that unordered_set (or map) is probably the most natural choice to represent your data in the first place, i.e., the proper design yields the optimized implementation. That doesn't always happen, but it's nice when it does!
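A minimal sketch of that unordered_set approach, assuming ID is an int (the field's type is not given in the question):

#include <unordered_set>
#include <vector>

// Elements of A whose ID does not occur in B. Expected O(n + m).
std::vector<MyType*> difference_by_id(const std::vector<MyType*>& A,
                                      const std::vector<MyType*>& B)
{
    std::unordered_set<int> idsInB;
    idsInB.reserve(B.size());
    for (const MyType* b : B) idsInB.insert(b->ID);

    std::vector<MyType*> C;
    for (MyType* a : A)
        if (idsInB.count(a->ID) == 0) C.push_back(a);
    return C;
}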
If B exists before A is populated, then while populating A you can do the bookkeeping directly in a C vector.