Sort numbers in digit order - c++

I know that when the numbers are input as strings and sorted normally, I would get the correct output. But can someone explain how/why this is happening?
Example test case:
Input:
100 1 10 2 21 20
Output:
1 10 100 2 20 21

Digits, when considered as characters, are handled the same as alphabetical characters. (i.e. they have a relative lexicographical order, which is the same as their ordering based on ascending value)
In effect, when you're handling integers as strings, you may consider the digits 0,1,2,3,4,5,6,7,8,9 to be the letters 'a','b','c','d','e','f','g','h','i','j'. Consequently, sorting the input can be thought of as sorting words.
Your original input:
100 1 10 2 21 20
can then be considered as:
baa b ba c cb ca
for which the appropriate ordering would be:
b ba baa c ca cb
If you switch the values back, you can see that this is the output you provided in your question, namely:
1 10 100 2 20 21

That is just how default lexicographical string comparison works.
What you are probably looking for is called natural sort:
Natural order means sorting strings so that embedded numbers are treated as numbers. This means that if you use natural order for sorting you get this:
1 one
2 two
3 three
10 ten
Instead of the default sort behaviour:
1 one
10 ten
2 two
3 three

Reading a list of numbers, sorting them and then printing them out is extremely easy in C++ once you know the "secret".
The "secret" is to use the functionality that exists in the standard library. In this case (reading numbers, sorting them, printing them out) you need to know about std::vector, std::istream_iterator, std::sort, std::copy and std::ostream_iterator.
Then you could do something like
#include <iostream>
#include <vector>
#include <iterator>
#include <algorithm>
int main()
{
    // Create a vector containing numbers read from std::cin
    std::vector<int> numbers(std::istream_iterator<int>(std::cin),
                             std::istream_iterator<int>());
    // Sort the numbers
    std::sort(begin(numbers), end(numbers));
    // Print all the numbers in the vector to std::cout
    std::copy(begin(numbers), end(numbers),
              std::ostream_iterator<int>(std::cout, " "));
}

Related

How to select the nth position from a string vector?

How does one find, let's say, the 2nd position from a string vector?
Here's a string vector example:
1 2 3 4 Hi
7 8 9 0 Bye
2 2 5 6 World
If I use example.at(2), it gives me the whole row 2 2 5 6 World.
I just want to get 2 from the 1st row instead of getting the whole line of 2 2 5 6 World. How do I do that?
The return value of example.at(2) is the 3rd item in your vector, in this case, a std::string.
To access a specific character in your string, you can use the operator[]. So to select 2 from the first row, you would simply need to do the following:
example.at(0)[2];
So what you actually have is a vector of strings, where each string adds another dimension. You have a table with both rows and columns, similar to a 2D array. In order to access a single cell you need two indexes: one for the position in the vector and another for the position in the string.
So in your case it would be example[0][0] to get the first char of the string in the first row, and to get the one you are looking for you would write example.at(0)[2];
This should work:
#include <iostream>
#include <string>
#include <vector>
int main() {
    std::vector<std::string> strings;
    strings.push_back("1234Hi");
    strings.push_back("7890Bye");
    std::cout << strings.at(0)[1] << std::endl; // prints 2
    std::cout << strings.at(1)[1] << std::endl; // prints 8
}
It's sort of like a two-dimensional array: each string you push to the vector is analogous to the first dimension, and then each character of the string is analogous to the second dimension.
But as mentioned above, there may be better ways to do this, depending on what exactly you're trying to do.
Other answers show you how to access individual numbers in your strings, but they assume that the numbers are always 1 digit in length. If you ever need to support multi-digit numbers, use std::istringstream() or std::stoi() instead to parse the strings.

Set_Intersection with repeated values

I think the set_intersection STL function described here: http://www.cplusplus.com/reference/algorithm/set_intersection/
is not really a set intersection in the mathematical sense. Suppose that in the example given there I change the lines to:
int first[] = {5,10,15,20,20,25};
int second[] = {50,40,30,20,10,20};
I would like to get 10 20 20 as a result. But I only get unique answers.
Is there a true set intersection in STL?
I know it's possible with a combination of merges and set_differences, btw. Just checking if I'm missing something obvious.
I would like to get 10 20 20 as a result. But I only get unique answers. Is there a true set intersection in STL?
std::set_intersection works how you want.
You probably get the wrong answer because you didn't update the code properly. If you change the sets to have 6 elements you need to update the lines that sort them:
std::sort (first,first+5); // should be first+6
std::sort (second,second+5); // should be second+6
And also change the call to set_intersection to use first+6 and second+6. Otherwise you only sort the first 5 elements of each set, and only get the intersection of the first 5 elements.
Obviously if you don't include the repeated value in the input, it won't be in the output. If you change the code correctly to include all the input values, it will work as you want.
cplusplus.com is not a good reference. If you look at http://en.cppreference.com/w/cpp/algorithm/set_intersection you will see that it clearly states the behaviour for repeated elements:
If some element is found m times in [first1, last1) and n times in [first2, last2), the first std::min(m, n) elements will be copied from the first range to the destination range.
Even the example at cplusplus.com is bad. It would be simpler, and it would be harder to introduce your bug, if it were written in idiomatic modern C++:
#include <iostream> // std::cout
#include <algorithm> // std::set_intersection, std::sort
#include <iterator> // std::back_inserter, std::begin, std::end
#include <vector> // std::vector
int main () {
    int first[] = {5,10,15,20,20,25};
    int second[] = {50,40,30,20,10,20};
    std::sort(std::begin(first), std::end(first));
    std::sort(std::begin(second), std::end(second));
    std::vector<int> v;
    std::set_intersection(std::begin(first), std::end(first),
                          std::begin(second), std::end(second),
                          std::back_inserter(v));
    std::cout << "The intersection has " << v.size() << " elements:\n";
    for (auto i : v)
        std::cout << ' ' << i;
    std::cout << '\n';
}
This automatically handles the right number of elements, without ever having to explicitly say 5 or 6 or any other magic number, and without having to create initial elements in the output vector and then resize it to remove them again.
set_intersection requires both ranges to be sorted. In the data you've given, second is not sorted.
If you sort it first, you should get your expected answer.

Recursion vs bitmasking for getting all combinations of vector elements

While practicing for programming competitions (like ACM, Code Jam, etc) I've met some problems that require me to generate all possible combinations of some vector elements.
Let's say that I have the vector {1,2,3}, I'd need to generate the following combinations (order is not important) :
1
2
3
1 2
1 3
2 3
1 2 3
So far I've done it with the following code :
// vec (the input) and combination (the current selection) are assumed to be global vectors
void getCombinations(int a)
{
    printCombination();
    for(int j=a;j<vec.size();j++)
    {
        combination.push_back(vec.at(j));
        getCombinations(j+1);
        combination.pop_back();
    }
}
Calling getCombinations(0); does the job for me. But is there a better (faster) way? I've recently heard of bitmasking. As I understand it, for every number between 1 and 2^N-1 I write that number in binary, and the 1s and 0s represent whether or not each element is included in the combination.
How do I implement this efficiently though? If I turn every number into binary the standard way (by dividing with 2 all the time) and then check all the digits, it seems to waste a lot of time. Is there any faster way? Should I keep on using the recursion (unless I run into some big numbers where recursion can't do the job (stack limit))?
The number of combinations you can get is 2^n (counting the empty one), where n is the number of your elements. You can interpret every integer from 0 to 2^n - 1 as a mask. In your example (elements 1, 2, 3) you have 3 elements and the masks would therefore be 000, 001, 010, 011, 100, 101, 110, and 111. Let every place in the mask represent one of your elements. For each place that has a 1, take the corresponding element; if the place contains a 0, leave the element out. For example, the number 5 would be the mask 101 and it would generate this combination: 1, 3.
If you want to have a fast and relatively short code for it, you could do it like this:
#include <cstdio>
#include <vector>
using namespace std;
int main(){
    vector<int> elements;
    elements.push_back(1);
    elements.push_back(2);
    elements.push_back(3);
    // 1<<n is essentially pow(2, n), but much faster and only for integers
    // the iterator i will be our mask i.e. its binary form will tell us which elements to use and which not
    for (int i=0;i<(1<<elements.size());++i){
        printf("Combination #%d:", i+1);
        for (int j=0;j<elements.size();++j){
            // 1<<j shifts the 1 for j places and then we check j-th binary digit of i
            if (i&(1<<j)){
                printf(" %d", elements[j]);
            }
        }
        printf("\n");
    }
    return 0;
}

Output wrong Project Euler 50

So I am attempting Problem 50 of Project Euler. (So close to level 2 :D) It goes like this:
The prime 41, can be written as the sum of six consecutive primes:
41 = 2 + 3 + 5 + 7 + 11 + 13
This is the longest sum of consecutive primes that adds to a prime below one-hundred.
The longest sum of consecutive primes below one-thousand that adds to a prime, contains 21 terms, and is equal to 953.
Which prime, below one-million, can be written as the sum of the most consecutive primes?
Here is my code:
#include <iostream>
#include <vector>
using namespace std;
int main(){
    vector<int> primes(1000000,true);
    primes[0]=false;
    primes[1]=false;
    for (int n=4;n<1000000;n+=2)
        primes[n]=false;
    for (int n=3;n<1000000;n+=2){
        if (primes[n]==true){
            for (int b=n*2;b<100000;b+=n)
                primes[b]=false;
        }
    }
    int basicmax,basiccount=1,currentcount,biggermax,biggercount=1,sum=0,basicstart,basicend,biggerstart,biggerend;
    int limit=1000000;
    for (int start=2;start<limit;start++){
        //cout<<start;
        sum=0;
        currentcount=0;
        for (int basic=start;start<limit&&sum+basic<limit;basic++){
            if (primes[basic]==true){
                //cout<<basic<<endl;
                sum+=basic;
                currentcount++;
            }
            if (primes[sum]&&currentcount>basiccount&&sum<limit){
                basicmax=sum;basiccount=currentcount;basicstart=start;basicend=basic;
            }
        }
        if (basiccount>biggercount){
            biggercount=basiccount;biggermax=basicmax;biggerend=basicend;biggerstart=basicstart;
        }
    }
    cout<<biggercount<<endl<<biggermax<<endl;
    return 0;
}
Basically it just creates a vector of all primes up to 1000000 and then loops through them finding the right answer. The answer is 997651 and the count is supposed to be 543 but my program outputs 997661 and 546 respectively. What might be wrong?
It looks like you're building your primes vector wrong:
for (int b=n*2;b<100000;b+=n)
primes[b]=false;
I think that should be 1,000,000, not 100,000. It might be better to refactor that number out as a constant to make sure it's consistent throughout.
The rest of it looks basically fine, although without testing it ourselves I'm not sure what else we can add. There's plenty of room for efficiency improvements: you do a lot of repeated scanning of ranges. For example, there's no point starting to sum when primes[start] is false, and you could build a second vector of just the primes for the summing. (Does Project Euler have runtime and memory limits? I can't remember.)
You are thinking about this the wrong way.
1. Generate the maximal sequence of primes, starting from 2, such that their sum is less than 1,000,000. This is 2, 3, 5, ..., p for some p.
2. Sum this sequence and test the sum for primality.
3. If it is prime, terminate and return the sum.
4. Otherwise, a shorter sequence must be the correct one. There are exactly two ways of shortening the sequence while preserving the consecutive-prime property: removing the first element or removing the last. Recurse from step 2 with both of these sequences.

set intersection

I want to calculate the gcd of two numbers m and n by prime factorization and taking the common factors, like this. Take for example m = 36, n = 48:
vector<int> factors1 = prime_factorization(m); // 2 2 3 3
vector<int> factors2 = prime_factorization(n); // 2 2 2 2 3
vector<int> intersection(10);
set_intersection(factors1.begin(), factors1.end(), factors2.begin(), factors2.end(), intersection.begin());
intersection is now 2 2 3 0 0 0 0 0 0 0. For this I must set the size of the vector beforehand, and the remaining elements are left as 0. I don't want this to happen.
Is there a better way to do this? Using sets or anything else?
Also, how do I calculate the product of the elements in the vector intersection (2*2*3) using the STL, ignoring the zeroes?
You can use a back-inserter:
vector<int> intersection;
set_intersection(..., back_inserter(intersection));
Note that there are much better ways of determining the GCD, such as Euclid's algorithm.
Oli's answer is best in the situation as you describe it. But if you were using a vector that already existed and had elements that you were writing over, and you wanted to chop off the extra numbers, you could do it a different way: call the vector member erase using the return value of set_intersection:
intersection.erase(
    set_intersection(factors1.begin(), factors1.end(), factors2.begin(), factors2.end(), intersection.begin()),
    intersection.end());