To find an integer without a pair in a sequence of integers - C++

The problem is to find an integer without its pair in a sequence of integers. Here's what I've written so far. To me it looks like it should work, but it doesn't. Any help for a noob programmer?
using namespace std;
int lonelyinteger(vector < int > a, int _a_size) {
for (int i = 0; i < _a_size; i++)
{
bool flag = false;
for (int n = i + 1; n < _a_size; n++)
{
if (a.at(i) == a.at(n))
{
flag = true;
break;
}
}
if (flag == false)
{
return a.at(i);
}
}
return 0;
}
For the input 1 1 2 it outputs 1, while it's supposed to output 2.
For 0 0 1 2 1 it outputs 0, and here it has to be 2.

The problem is that your inner loop only checks for a duplicate from index i + 1 onward. In the case 1 1 2, when the outer loop reaches a[1], which is 1, there is no element after that index that is equal to 1, so the function returns 1.
In general, there is a better solution to this problem. Instead of going through the vector twice, you can use a set to keep track of all the elements you have already encountered. For each element, check if the set already contains it. If not, add it to the set. Otherwise, remove it from the set. Anything remaining in the set will be unique within the vector.
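A minimal sketch of the set-based idea described above, assuming C++11 or later. The name unpaired_values is made up for this illustration, and it returns everything left in the set rather than a single integer, since more than one value may remain.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns the values left over after pairing.
// The order of the returned values is unspecified.
std::vector<int> unpaired_values(const std::vector<int>& a) {
    std::unordered_set<int> open;  // values seen without a partner so far
    for (int x : a) {
        auto result = open.insert(x);
        if (!result.second) {
            // x was already in the set: this occurrence pairs it off.
            open.erase(result.first);
        }
    }
    return std::vector<int>(open.begin(), open.end());
}
```

For the inputs from the question, unpaired_values({1, 1, 2}) and unpaired_values({0, 0, 1, 2, 1}) both leave only 2 in the set.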

All of the answers are good.
Now, assume that the array cannot be sorted. Here is a somewhat lazy approach using std::map that shows what can be done using the various algorithm functions.
#include <map>
#include <vector>
#include <iostream>
#include <algorithm>
using namespace std;
int lonelyinteger(const std::vector<int>& a)
{
typedef std::map<int, int> IntMap;
IntMap theMap;
// build the map
for_each(a.begin(), a.end(), [&](int n){ theMap[n]++; });
// find the first entry with a count of 1
return
find_if(theMap.begin(), theMap.end(),
[](const IntMap::value_type& pr){return pr.second == 1; })->first;
}
int main()
{
std::vector<int> TestVect = { 1, 1, 2 };
cout << lonelyinteger(TestVect);
}
Live example: http://ideone.com/0t89Ni
This code assumes that:
the passed-in vector is not empty,
the first item found with a count of 1 is the lonely value,
there is at least one "lonely value".
I also changed the signature to take a vector by reference and not send the count (since a vector knows its own size).
The code does not do any hand-coded loops, so that is one source of error removed.
Second, counting the number of times a number is seen is more or less done by the map, using operator[] to insert new entries and ++ to increase the count on the entry.
Last, the search for the first entry with only a count of 1 is done with std::find_if, again guaranteeing success (given that the data follows the assumptions made above).
So basically, without really trying hard, a solution can be written using algorithm functions and usage of the std::map associative container.
If your data will consist of multiple (or even no) "lonely" integers, the following changes can be made:
#include <map>
#include <vector>
#include <iostream>
#include <algorithm>
#include <iterator>
using namespace std;
std::vector<int> lonelyinteger(const std::vector<int>& a)
{
std::vector<int> retValue;
typedef std::map<int, int> IntMap;
IntMap theMap;
// build the map
for_each(a.begin(), a.end(), [&](int n){ theMap[n]++; });
// find all entries with a count of 1
for_each(theMap.begin(), theMap.end(),
[&](const IntMap::value_type& pr)
{if (pr.second == 1) retValue.push_back(pr.first); });
// return our answer
return retValue;
}
int main()
{
std::vector<int> TestVect = { 1, 1, 2, 3, 5, 0, 2, 8 };
std::vector<int> ans = lonelyinteger(TestVect);
copy(ans.begin(), ans.end(), ostream_iterator<int>(cout," "));
}
Live example: http://ideone.com/40NY4k
Note that we now retrieve all entries with a count of 1 and store them in a vector that will be returned.

A simple answer might be to just sort the list and then look for an element whose value differs from the values before and after it.
Your problem is that the last item of any given value in the list has no subsequent duplicates, and you are treating having no subsequent duplicates as the same as having no duplicates (which is false).
If you don't want to remove values that your inner loop has seen and identified earlier as a duplicate of a "previous" value, loop over all values in the inner loop, ignoring the match with itself.
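As a rough sketch of that last suggestion (the function name lonely_integer_fixed is only illustrative), the inner loop scans the whole vector and skips the position being tested. It keeps the question's fallback of returning 0 when every value has a partner.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical corrected version of the function from the question.
int lonely_integer_fixed(const std::vector<int>& a) {
    for (std::size_t i = 0; i < a.size(); ++i) {
        bool has_pair = false;
        // Scan every position, not just those after i, and skip i itself.
        for (std::size_t n = 0; n < a.size(); ++n) {
            if (n != i && a[i] == a[n]) {
                has_pair = true;
                break;
            }
        }
        if (!has_pair) {
            return a[i];
        }
    }
    return 0;  // no unpaired value found (same fallback as the question)
}
```

This is still quadratic in the size of the input, so it only fixes the logic; it does not make the search faster.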

Related

STL algorithms for pairwise comparison and tracking max/longest sequence

Consider this fairly easy algorithmic problem:
Given an array of (unsorted) numbers, find the length of the longest sequence of adjacent numbers that are increasing. For example, if we have {1,4,2,3,5}, we expect the result to be 3 since {2,3,5} gives the longest increasing sequence of adjacent/contiguous elements. Note that for non-empty arrays, such as {4,3,2,1}, the minimum result will be 1.
This works:
#include <algorithm>
#include <iostream>
#include <vector>
template <typename T, typename S>
T max_adjacent_length(const std::vector<S> &nums) {
if (nums.size() == 0) {
return 0;
}
T maxLength = 1;
T currLength = 1;
for (size_t i = 0; i < nums.size() - 1; i++) {
if (nums[i + 1] > nums[i]) {
currLength++;
} else {
currLength = 1;
}
maxLength = std::max(maxLength, currLength);
}
return maxLength;
}
int main() {
std::vector<double> nums = {1.2, 4.5, 3.1, 2.7, 5.3};
std::vector<int> ints = {4, 3, 2, 1};
std::cout << max_adjacent_length<int, double>(nums) << "\n"; // 2
std::cout << max_adjacent_length<int, int>(ints) << "\n"; // 1
return 0;
}
As an exercise for myself, I was wondering if there is/are STL algorithm(s) that achieve the same effect, thereby (ideally) avoiding the raw for-loop I have. The motivation behind doing this is to learn more about STL algorithms, and practice using abstracted algorithms to make my code more general and reusable.
Here are my ideas, but they don't quite achieve what I'd like.
std::adjacent_find achieves the pairwise comparisons and can be used to find the index of a non-increasing pair, but doesn't easily facilitate the ability to keep a current and maximum length and compare the two. It could be possible to have those state variables as part of my predicate function, but that seems a bit wrong since ideally you'd like your predicate function to not have any side effects, right?
std::adjacent_difference is interesting. One could use it to construct a vector of the differences between adjacent numbers. Then, starting from the second element, depending on if the difference is positive or negative, we could again track the maximum number of consecutive positive differences seen. This is actually quite close to achieving what we'd like. See the example code below:
#include <algorithm>
#include <numeric>
#include <vector>
template <typename T, typename S> T max_adjacent_length(std::vector<S> &nums) {
if (nums.size() == 0) {
return 0;
}
std::adjacent_difference(nums.begin(), nums.end(), nums.begin());
nums.erase(std::begin(nums)); // keep only differences
T maxLength = 1, currLength = 1;
for (auto n : nums) {
currLength = n > 0 ? (currLength + 1) : 1;
maxLength = std::max(maxLength, currLength);
}
return maxLength;
}
The problem here is that we lose the const-ness of nums if we want to compute the differences in place, or we have to sacrifice space and create a copy of nums, which is a no-no given that the original solution already has O(1) space complexity.
Is there an idea/solution that I have overlooked that achieves what I want in a succinct and readable manner?
In both your code snippets, you are iterating through a range (in the first version with an index-based loop, and in the second with a range-for loop). This is not really the kind of code you should be writing if you want to use the standard algorithms, which work with iterators into the range. Instead of thinking of a range as a collection of elements, if you start thinking in terms of pairs of iterators, choosing the right algorithms becomes easier.
For this problem, here's a reasonable way to write this code:
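#include <algorithm>
#include <iterator>
auto max_adjacent_length = [](auto const & v)
{
  // std::distance returns the iterator's difference_type, so use that type for max.
  decltype(std::distance(v.begin(), v.end())) max = 0;
  auto begin = v.begin();
  while (begin != v.end()) {
    // End of the sorted run that starts at begin. Equal neighbours count as
    // sorted, so the runs are non-decreasing rather than strictly increasing.
    auto next = std::is_sorted_until(begin, v.end());
    max = std::max(std::distance(begin, next), max);
    begin = next;
  }
  return max;
};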
Note that you were already on the right track in terms of picking a reasonable algorithm. This could be solved with adjacent_find as well, with just a little more work.
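One way the adjacent_find variant might look, as a sketch rather than the answerer's own code. The name max_increasing_run is made up here, and std::greater_equal<> assumes C++14. Unlike is_sorted_until with the default comparison, this treats equal neighbours as a break, so it measures strictly increasing runs.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>

// Hypothetical helper: length of the longest strictly increasing run
// of adjacent elements in [first, last).
template <typename It>
std::ptrdiff_t max_increasing_run(It first, It last) {
    std::ptrdiff_t best = 0;
    while (first != last) {
        // Finds the first pair (a, b) with a >= b and returns an iterator to a,
        // which is the last element of the current increasing run.
        It brk = std::adjacent_find(first, last, std::greater_equal<>{});
        It run_end = (brk == last) ? last : std::next(brk);
        best = std::max<std::ptrdiff_t>(best, std::distance(first, run_end));
        first = run_end;  // the next run starts right after the break
    }
    return best;
}
```

The predicate stays free of side effects; all state lives in the loop around the algorithm call.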

Manipulating array's values in a certain way

So I was asked to write a function that changes an array's values such that:
All of the values that are the smallest aren't changed.
If, say, the smallest number is 2 and there are no 3s or 4s, then all 5s are changed to 3s, etc.
For example, for an array [2, 5, 7, 5] we would get [2, 3, 4, 3]. In general, the minimum value of the array remains unchanged, and every other distinct value is replaced according to its rank among the distinct values. In our example, 5 is the smallest value after 2, so it becomes 2 + 1 = 3; 7 is the second smallest after 2, so it becomes 2 + 2 = 4.
I've come up with something like this:
int fillGaps(int arr[], size_t sz){
int min = *min_element(arr, arr+sz);
int w = 1;
for (int i = 0; i<sz; i++){
if (arr[i] == min) {continue;}
else{
int mini = *min_element(arr+i, arr+sz);
for (int j = 0; j<sz; j++){
if (arr[j] == mini){arr[j] = min+w;}
}
w++;}
}
return arr[sz-1];
}
However, it works fine only for the 0th and 1st values; it doesn't affect any further items. Could anyone please help me with that?
I don't quite follow the logic of your function, so can't quite comment on that.
Here's how I interpret what needs to be done. Note that my example implementation is written to be as understandable as possible. There might be ways to make it faster.
Note that I'm also using an std::vector, to make things more readable and C++-like. You really shouldn't be passing raw pointers and sizes, that's super error prone. At the very least bundle them in a struct.
#include <algorithm>
#include <set>
#include <unordered_map>
#include <vector>
int fillGaps (std::vector<int> & data) {
// Make sure we don't have to worry about edge cases in the code below.
if (data.empty()) { return 0; }
/* The minimum number of times we need to loop over the data is two.
* First to check which values are in there, which lets us decide
* what each original value should be replaced with. Second to do the
* actual replacing.
*
* So let's trade some memory for speed and start by creating a lookup table.
* Each entry will map an existing value to its new value. Let's use the
* "define lambda and immediately invoke it" to make the scope of variables
* used to calculate all this as small as possible.
*/
auto const valueMapping = [&data] {
// Use an std::set so we get all unique values in sorted order.
std::set<int> values;
for (int e : data) { values.insert(e); }
std::unordered_map<int, int> result;
result.reserve(values.size());
// Map minimum value to itself, and increase replacement value by one for
// each subsequent value present in the data vector.
int replacement = *values.begin();
for (auto e : values) { result.emplace(e, replacement++); }
return result;
}();
// Now the actual algorithm is trivial: loop over the data and replace each
// element with its replacement value.
for (auto & e : data) { e = valueMapping.at(e); }
return data.back();
}

Implementing partition_unique and stable_partition_unique algorithms

I'm looking for a way to partition a set of ordered elements such that all unique elements occur before their respective duplicates. Since std::unique is not applicable (duplicate elements are overwritten), I thought of using std::partition. Calling this algorithm partition_unique, I also need the corresponding stable_partition_unique (i.e. like stable_partition).
A basic implementation of partition_unique is:
#include <algorithm>
#include <iterator>
#include <unordered_set>
#include <functional>
template <typename BidirIt, typename BinaryPredicate = std::equal_to<void>>
BidirIt partition_unique(BidirIt first, BidirIt last, BinaryPredicate p = BinaryPredicate {})
{
using ValueTp = typename std::iterator_traits<BidirIt>::value_type;
// Pass p to the set so the caller's equality predicate is actually used.
std::unordered_set<ValueTp, std::hash<ValueTp>, BinaryPredicate> seen(0, std::hash<ValueTp> {}, p);
seen.reserve(std::distance(first, last));
return std::partition(first, last,
[&seen] (const ValueTp& value) {
return seen.insert(value).second;
});
}
Which can be used like:
#include <vector>
#include <iostream>
int main()
{
std::vector<int> vals {1, 1, 2, 4, 5, 5, 5, 7, 7, 9, 10};
const auto it = partition_unique(std::begin(vals), std::end(vals));
std::cout << "Unique values: ";
std::copy(std::begin(vals), it, std::ostream_iterator<int> {std::cout, " "}); // Unique values: 1 10 2 4 5 9 7
std::cout << '\n' << "Duplicate values: ";
std::copy(it, std::end(vals), std::ostream_iterator<int> {std::cout, " "}); // Duplicate values: 7 5 5 1
}
The corresponding stable_partition_unique can be achieved by replacing std::partition with std::stable_partition.
The problem with these approaches is that they unnecessarily buffer all unique values in the std::unordered_set (which also adds a hash function requirement), which shouldn't be required as the elements are sorted. It's not too much work to come up with a better implementation for partition_unique, but an implementation of stable_partition_unique seems considerably more difficult, and I'd rather not implement this myself if possible.
Is there a way to use existing algorithms to achieve optimal partition_unique and stable_partition_unique algorithms?
Create a queue to hold the duplicates. Then, initialize two indexes, src and dest, starting at index 1, and go through the list. If the current item (list[src]) is equal to the previous item (list[dest-1]), then copy it to the queue. Otherwise, copy it to list[dest] and increment dest.
When you've exhausted the list, copy items from the queue to the tail of the original list.
Something like:
Queue dupQueue
int src = 1
int dest = 1
while (src < list.count)
{
if (list[src] == list[dest-1])
{
// it's a duplicate.
dupQueue.push(list[src])
}
else
{
list[dest] = list[src]
++dest
}
++src
}
while (!dupQueue.IsEmpty)
{
list[dest] = dupQueue.pop()
++dest
}
I know the STL has a queue. Whether it has an algorithm similar to the above, I don't know.
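The pseudocode above could be translated to C++ roughly as follows. This is a sketch under the question's assumption that the input is sorted (equal elements are adjacent); the name stable_partition_unique_sorted and the index-based vector interface are choices made for this illustration, not part of the original answer.

```cpp
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical helper: moves the first occurrence of each value to the front
// (in order) and the duplicates to the back (in order).
// Returns the index of the first duplicate.
template <typename T>
std::size_t stable_partition_unique_sorted(std::vector<T>& list) {
    if (list.empty()) {
        return 0;
    }
    std::queue<T> dups;    // duplicates, kept in their original order
    std::size_t dest = 1;  // next free slot in the unique prefix
    for (std::size_t src = 1; src < list.size(); ++src) {
        if (list[src] == list[dest - 1]) {
            dups.push(std::move(list[src]));  // it's a duplicate
        } else {
            if (dest != src) {
                list[dest] = std::move(list[src]);
            }
            ++dest;
        }
    }
    const std::size_t boundary = dest;
    // Copy the queued duplicates back to the tail of the list.
    while (!dups.empty()) {
        list[dest++] = std::move(dups.front());
        dups.pop();
    }
    return boundary;
}
```

Note that the queue still buffers the duplicates, so the extra space is proportional to the number of duplicates rather than the number of unique values.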

efficient method to select index of vector in c++

In C++, suppose you have a vector of boolean values, and you want to randomly select one index among those corresponding to true values.
What is the most efficient method to use?
Example:
vector<bool> v(4);
v.at(0) = true;
v.at(1) = false;
v.at(2) = true;
v.at(3) = true;
You want to select a number among the subset {0,2,3}.
I have so far tried 2 methods:
Stacking indexes in a vector and then selecting among these elements. Extremely slow.
Naive method: randomly select an index until v.at(rnd_sel_index) is true. Considerably faster.
Any suggestions faster than method 2?
Perhaps there's a more efficient approach.
Rather than storing what is there and what is not, perhaps it's better to store only what is not - i.e. a vector containing indices that are free.
The order of this vector can easily be randomised once, and you can then pull items from the back() until it's empty().
When you want to return items to the 'free index pool', simply insert them in a random position in the vector.
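A possible shape for such a pool, as a sketch only. The class name FreeIndexPool and its member names are invented for this example, and the seed is passed in explicitly so runs are reproducible. Instead of a true insert at a random position, give_back appends and swaps with a random slot, which is an O(1) variant of the same idea.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical pool of free indices 0..n-1, handed out in random order.
class FreeIndexPool {
public:
    FreeIndexPool(std::size_t n, unsigned seed) : rng_(seed), free_(n) {
        for (std::size_t i = 0; i < n; ++i) {
            free_[i] = i;
        }
        // Randomise the order once up front.
        std::shuffle(free_.begin(), free_.end(), rng_);
    }

    bool empty() const { return free_.empty(); }

    // Precondition: !empty().
    std::size_t take() {
        std::size_t index = free_.back();
        free_.pop_back();
        return index;
    }

    // Return an index to the pool at a random position.
    void give_back(std::size_t index) {
        free_.push_back(index);
        std::uniform_int_distribution<std::size_t> pick(0, free_.size() - 1);
        std::swap(free_.back(), free_[pick(rng_)]);
    }

private:
    std::mt19937 rng_;
    std::vector<std::size_t> free_;
};
```

Each take() is constant time, so this suits the case where many indices are drawn from the same set.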
You can use the well-known method for selecting an element from a sequence of unknown length (reservoir sampling with a reservoir of size one).
Example Code:
#include <random>
#include <iostream>
#include <vector>
#include <algorithm>
std::size_t choose_element(const std::vector<bool>& v) {
auto last = v.end();
auto chosen_i = std::find(v.begin(), last, true);
if (chosen_i == last)
return v.size(); // no true element: return an out-of-range index
auto i = std::find(std::next(chosen_i), last, true);
double n = 2.0;
static auto random_generator = std::mt19937{std::random_device{}()};
while (i != last) {
if (std::bernoulli_distribution(1.0 / n)(random_generator))
chosen_i = i;
i = std::find(std::next(i), last, true);
++n;
}
return std::distance(v.begin(), chosen_i);
}
int main() {
std::vector<bool> v = {true, true, false, true};
std::vector<int> indexes(v.size());
const double N = 100;
for (int i=0; i<N; ++i)
++indexes[choose_element(v)];
for (auto& index : indexes)
std::cout << std::distance(indexes.data(), &index) << ": " << (index / N) << "\n";
return 0;
}
This has predictable performance and only takes one pass through the data. Of course if you are taking multiple samples from the same vector it may be more efficient to restructure the data to a different format and then draw from that. Also, if nearly all of the elements are true, your method (2) might perform better in the average case.

Offer optimised algorithm to work with timestamped values [duplicate]

Possible Duplicate:
C#. Need to optimise counting positive and negative values
I need to maximize speed of the following functionality:
a. A value comes in. Each value has 2 properties: an int value and a long timestamp in ticks.
b. I need to count previously stored values which are younger than 1 ms (relative to the current one).
c. I need to count negative and positive values separately.
d. I only need to know whether there are either 10 negative or 10 positive values. I don't need to keep any other knowledge of the values.
My idea is to implement 2 ring arrays, for positive and negative values separately, replacing expired entries with 0 and keeping track of the positive and negative counts as values come in.
Any thoughts?
Maintaining 2 buffers to keep the positives separated from the negatives sounds like a pain and inefficient.
You could instead have a single buffer with all the values, and use std::accumulate to count up the positives and negatives. If you start with a collection of all the tuples (each of which has an age and a value), you could begin by sorting the collection according to age, finding the last element that is <= 1 ms old, and then using accumulate from begin() to that point. Here's some code that demonstrates that last bit:
#include <algorithm>
#include <numeric>
#include <iterator>
#include <vector>
#include <cstdlib>
#include <ctime>
using namespace std;
struct Counter
{
Counter(unsigned pos=0, unsigned neg=0) : pos_(pos), neg_(neg) {};
unsigned pos_, neg_;
Counter& operator+(int n)
{
if( n < 0 )
++neg_;
else if( n > 0 )
++pos_;
return * this;
}
};
int main()
{
srand((unsigned)time(0));
vector<int> vals;
generate_n(back_inserter(vals), 1000, []()
{
return (rand() / (RAND_MAX/40)) - 20;
});
Counter cnt = accumulate(vals.begin(), vals.end(), Counter());
}
If sorting the collection by age and then searching the sorted results for the last eligible entry sounds too inefficient, you could use for_each_if instead of accumulate and simply iterate over the whole collection once. for_each_if isn't part of the Standard Library, but it's easy enough to write. If you don't want to muck about with writing your own for_each_if, that's fine too. You could simply tweak the accumulator a bit so that it doesn't accumulate elements which are too old:
#include <algorithm>
#include <numeric>
#include <iterator>
#include <vector>
#include <cstdlib>
#include <ctime>
using namespace std;
struct Tuple
{
int val_;
unsigned age_;
};
struct Counter
{
Counter(unsigned pos=0, unsigned neg=0) : pos_(pos), neg_(neg) {};
unsigned pos_, neg_;
Counter& operator+(const Tuple& tuple)
{
if( tuple.age_ > 1 )
return * this;
if( tuple.val_ < 0 )
++neg_;
else if( tuple.val_ > 0 )
++pos_;
return * this;
}
};
int main()
{
srand((unsigned)time(0));
vector<Tuple> tuples;
generate_n(back_inserter(tuples), 1000, []() -> Tuple
{
Tuple retval;
retval.val_ = (rand() / (RAND_MAX/40)) - 20;
retval.age_ = (rand() / (RAND_MAX/5));
return retval;
});
Counter cnt = accumulate(tuples.begin(), tuples.end(), Counter());
}
I would store the values in a heap keyed by timestamp and ordered so that the youngest values are at the top (that is, a max-heap on the timestamp, or equivalently a min-heap on age). The integer value is auxiliary data at each node. You could then implement the counting with a recursive function that traverses the heap, passing the running totals of negatives and positives back up the recursive calls.
It would look something like this, in Python-like pseudocode with types:
def young_pos_and_neg(Time currtime, HeapNode p):
    # Children are never younger than their parent, so an old node prunes its subtree.
    if (p is not None and currtime - p.time < 1):
        posleft, negleft = young_pos_and_neg(currtime, p.leftChild())
        posright, negright = young_pos_and_neg(currtime, p.rightChild())
        totpos = posleft + posright
        totneg = negleft + negright
        if (p.intValue < 0):
            return totpos, totneg + 1
        else:
            return totpos + 1, totneg
    else:
        return 0, 0
If you call this on the heap root before inserting the new value - but with the new value's timestamp as the currtime argument - you will get a count of each. It may not be the fastest possible way, but it's pretty simple and elegant. In C++ you could replace the tuple return value with a struct.
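In C++ this might look roughly like the sketch below. It assumes the heap is stored as an array with the newest sample at index 0 and every parent at least as new as its children. The names Sample, Counts and count_young, the explicit window parameter, and the decision to ignore zero values are all assumptions made for this illustration.

```cpp
#include <cstddef>
#include <vector>

struct Sample {
    long long time;  // timestamp in ticks
    int value;
};

struct Counts {
    unsigned pos = 0;
    unsigned neg = 0;
};

// Counts positive and negative samples younger than `window` ticks.
// `heap` is an array-based binary heap; children of node i are 2i+1 and 2i+2.
Counts count_young(const std::vector<Sample>& heap, std::size_t node,
                   long long now, long long window) {
    Counts total;
    // Stop at missing nodes and at nodes that are too old: everything
    // below an old node is at least as old, so the subtree can be skipped.
    if (node >= heap.size() || now - heap[node].time >= window) {
        return total;
    }
    const Counts left = count_young(heap, 2 * node + 1, now, window);
    const Counts right = count_young(heap, 2 * node + 2, now, window);
    total.pos = left.pos + right.pos;
    total.neg = left.neg + right.neg;
    if (heap[node].value < 0) {
        ++total.neg;
    } else if (heap[node].value > 0) {
        ++total.pos;
    }
    return total;
}
```

Calling count_young(heap, 0, now, window) with the new value's timestamp as now gives both counts; the caller can then check whether either has reached 10.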