Improving a solution - c++

The description of a task goes like this:
We have n numbers, and we have to find the number of unique sums over all pairs in the array.
For example:
3 2 5 6 3
The sums of all the pairs(non-repeated) are 5 9 8 6 8 7 5 11 9 8
Unique are 5 9 8 6 7 11
Therefore output is 6
I have come up with this really primitive, and time-consuming (meaning complexity) solution:
int n = 0;
cin >> n;
vector<int> vec(n);
for (int i = 0; i < n; i++)
{
cin >> vec[i];
}
vector<int> sum;
for (int i = 0; i < n; i++)
{
for (int j = i+1; j < n; j++)
{
sum.push_back(vec[i] + vec[j]);
}
}
sort(sum.begin(), sum.end());
for (int i = 0; i < sum.size()-1;)
{
if (sum[i] == sum[i + 1]) sum.erase(sum.begin() + i);
else i++;
}
cout << endl << sum.size();
I feel like there could be a solution using Combinatorics or something easier. I have thought a lot and couldn't think of anything. So my request is if anyone can improve the solution.

As mentioned above, it is difficult to do this without computing the sums of all pairs, so I am not going to address that; I will only give advice about efficient data structures.
Analysis of your solution
Your code first computes all the sums, O(n^2), then sorts them, O(n^2 log(n)), then removes duplicates. But erasing from a vector costs time linear in the number of elements between the erased position and the end. With O(n^2) sums, the second loop can therefore make the complexity of your algorithm O(n^4).
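You can count the unique elements in a sorted array without removing anything:
int count = sum.empty() ? 0 : 1; // a non-empty array has at least one distinct value
for (size_t i = 0; i + 1 < sum.size(); ++i)
{
if (sum[i] != sum[i + 1]) ++count; // each change of value starts a new distinct sum
}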
This change alone makes your algorithm complexity O(n^2 log n).
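As a rough sketch of the same idea with the standard library: after sorting, std::unique moves one representative of each run of equal values to the front and returns an iterator just past them, so its distance from begin() is the number of distinct sums. The function name count_distinct_pair_sums is made up for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of distinct sums vec[i] + vec[j] with i < j.
std::size_t count_distinct_pair_sums(const std::vector<int>& vec) {
    std::vector<int> sums;
    for (std::size_t i = 0; i < vec.size(); ++i)
        for (std::size_t j = i + 1; j < vec.size(); ++j)
            sums.push_back(vec[i] + vec[j]);

    std::sort(sums.begin(), sums.end());                // O(n^2 log n)
    auto last = std::unique(sums.begin(), sums.end());  // one linear pass
    return static_cast<std::size_t>(last - sums.begin());
}
```

For the example input 3 2 5 6 3 this returns 6. Unlike the erase loop, std::unique never shifts the tail of the vector more than once, so the sort dominates the running time.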
Alternatives without sorting.
Here are alternatives that run in O(n^2) time and use storage that depends on the range of the input values rather than on the length of the vector (except for the last one).
I am testing with 1000 elements between 0 and 9999:
vector<int> vec;
for(int i = 0; i < 1000; ++i){
vec.push_back(rand() % 10000);
}
Your implementation sum_pairs1(vec) (18 seconds)
int sum_pairs1(const vector<int> &vec){
vector<int> sum;
int n = vec.size();
for (int i = 0; i < n; i++)
{
for (int j = i+1; j < n; j++)
{
sum.push_back(vec[i] + vec[j]);
}
}
sort(sum.begin(), sum.end());
for (int i = 0; i < sum.size()-1;)
{
if (sum[i] == sum[i + 1]) sum.erase(sum.begin() + i);
else i++;
}
return sum.size();
}
If you know the range of the sums, you can use a bitset, which makes efficient use of memory: sum_pairs2<20000>(vec) (0.016 second).
template<size_t N>
int sum_pairs2(const vector<int> &vec){
bitset<N> seen;
int n = vec.size();
for (int i = 0; i < n; i++)
{
for (int j = i+1; j < n; j++)
{
seen[vec[i] + vec[j]] = true;
}
}
return seen.count();
}
If you know that the maximum sum is not too high (the values are not very sparse) but you don't know it at compile time, you can use a vector<bool>. Tracking the minimum and maximum lets you allocate only what is needed and also supports negative values.
int sum_pairs2b(const vector<int> &vec){
int VMAX = vec[0];
int VMIN = vec[0];
for(auto v : vec){
if(VMAX < v) VMAX = v;
else if(VMIN > v) VMIN = v;
}
vector<bool> seen(2*(VMAX - VMIN) + 1);
int n = vec.size();
for (int i = 0; i < n; i++)
{
for (int j = i+1; j < n; j++)
{
seen[vec[i] + vec[j] - 2*VMIN] = true;
}
}
int count = 0;
for(auto c : seen){
if(c) ++count;
}
return count;
}
And if you want a more general solution that works well with sparse data: sum_pairs3<int>(vec) (0.097 second).
template<typename T>
int sum_pairs3(const vector<T> &vec){
unordered_set<T> seen;
int n = vec.size();
for (int i = 0; i < n; i++)
{
for (int j = i+1; j < n; j++)
{
seen.insert(vec[i] + vec[j]);
}
}
return seen.size();
}

Related

Displaying first element user input in an Array

We can define the term 'value of a name' as the average position of
the letters in the name, calculating 'A' as 1, 'B' as 2, 'C' as 3, and
so on. The value of "BOB" would be (2 + 15 + 2)/ 3 = 6. According to
this value, the names will be arranged from the smallest towards the
biggest in the output. When two or more names have the same value,
the name which is in the first position in the original list (the
first one the user inputs) should show up first in the sorted list
(the output).
Input In the first line we have an integer N (1 <= N <= 100), which is
the number of names. In every of the N lines we have one name ([A-Z],
no empty spaces). Names contain 1 - 200 letters.
Output Print out the sorted list (one name in a line).
Test-case
Input:
3
BOB
AAAAAAA
TOM
Output:
AAAAAAA
BOB
TOM
I tried something, and the code seemed to work, I just had a problem with the output. I couldn't find a way to arrange the names with the same value, according to their position in the original list. Here's the other test-case I tried, but didn't figure out:
Input:
10
COSOPYILSPKNKZSTUZVMEERQDL
RRPPNG
PQUPOGTJETGXDQDEMGPNMJEBI
TQJZMOLQ
BKNGFEJZWMJNJLSTUBHCFHXWMYUPZM
YNWEPZKNBOOXNZVWKIUS
LV
CJDFYDMYZVOEW
TMHEJLIDEHT
KGTGFIFWYTKPWTYQQPGKRRYFXN
Output:
TMHEJLIDEHT
PQUPOGTJETGXDQDEMGPNMJEBI
BKNGFEJZWMJNJLSTUBHCFHXWMYUPZM
CJDFYDMYZVOEW
RRPPNG
COSOPYILSPKNKZSTUZVMEERQDL
KGTGFIFWYTKPWTYQQPGKRRYFXN
TQJZMOLQ
YNWEPZKNBOOXNZVWKIUS
LV
My output:
TMHEJLIDEHT
PQUPOGTJETGXDQDEMGPNMJEBI
CJDFYDMYZVOEW // these two
BKNGFEJZWMJNJLSTUBHCFHXWMYUPZM // should be arranged with their places switched
RRPPNG
COSOPYILSPKNKZSTUZVMEERQDL
KGTGFIFWYTKPWTYQQPGKRRYFXN
TQJZMOLQ
YNWEPZKNBOOXNZVWKIUS
LV
#include <iostream>
#include <string>
using namespace std;
int main() {
int N;
cin >> N;
string words[N];
int res[N];
for (int i = 0; i < N; i++) {
int sum = 0;
int value = 0;
int temp = 0;
string word;
cin >> words[i];
word = words[i];
for (int j = 0; j < word.length(); j++) {
sum += (int)word[j] - 64;
}
value = sum / word.length();
res[i] = value;
}
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++) {
if (res[i] < res[j]) {
swap(res[i], res[j]);
swap(words[i], words[j]);
}
}
}
for (int i = 0; i < N; i++) {
cout << words[i] << endl;
}
return 0;
}
string words[N];
int res[N];
This is not valid standard C++: the size of a stack array must be a compile-time constant, although some compilers accept a runtime size as an extension. You can use std::vector instead, which behaves much like an array.
vector<string> words;
vector<int> res;
for (int i = 0; i < N; i++) {
int sum = 0;
int value = 0;
int temp = 0;
string word;
cin >> word;
words.push_back(word);
for (int j = 0; j < word.length(); j++) {
sum += (int)word[j] - 64;
}
value = sum / word.length();
res.push_back(value);
}
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++) {
if (res[i] < res[j]) {
swap(res[i], res[j]);
swap(words[i], words[j]);
}
}
}
The ordering is because your sorting algorithm is not stable. Stable means that items with equal values will maintain the same order relative to each other.
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++) {
if (res[i] < res[j]) {
swap(res[i], res[j]);
swap(words[i], words[j]);
}
}
}
What you have is very close to bubble sort, which is stable.
for (int i = 0; i < N; i++) {
for (int j = 0; j < N - i - 1; j++) { // i elements sorted so far
if (res[j] > res[j + 1]) {
swap(res[j], res[j + 1]);
swap(words[j], words[j + 1]);
}
}
}
C++ also provides a stable sort, std::stable_sort in <algorithm>, but unfortunately it can't operate directly on two parallel arrays like this. One option is to compute the value on the fly in the comparator, another is to make a class holding both items and sort that, and a third is to sort the indices.
std::stable_sort(words.begin(), words.end(), [&](auto &a, auto &b)
{
int suma = 0, sumb = 0; // better yet, make a "int value(const string &str)" function.
for (int j = 0; j < a.length(); j++) {
suma += (int)a[j] - 64;
}
for (int j = 0; j < b.length(); j++) {
sumb += (int)b[j] - 64;
}
int valuea = suma / a.length();
int valueb = sumb / b.length();
return valuea < valueb;
});
A class containing both items is pretty straightforward. For indices, make a third array and sort that.
vector<size_t> indices;
...
string word;
cin >> word;
indices.push_back(words.size());
words.push_back(word);
...
std::stable_sort(indices.begin(), indices.end(), [&](auto a, auto b){ return res[a] < res[b]; });
for (int i = 0; i < N; i++) {
cout << words[indices[i]] << endl;
}
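The class-based option mentioned above could look roughly like this. It is only a sketch: Entry, name_value and sort_names are names invented for the illustration, and the value is the truncated integer average, as in the question's code.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record pairing a name with its precomputed value.
struct Entry {
    std::string name;
    int value;
};

// Average letter position with 'A' = 1, truncated like the integer
// division in the question. Assumes a non-empty word of [A-Z].
int name_value(const std::string& word) {
    int sum = 0;
    for (char c : word) sum += c - 'A' + 1;
    return sum / static_cast<int>(word.length());
}

// Returns the names ordered by value. Ties keep their input order because
// std::stable_sort preserves the relative order of equivalent elements.
std::vector<std::string> sort_names(const std::vector<std::string>& names) {
    std::vector<Entry> entries;
    for (const auto& n : names) entries.push_back({n, name_value(n)});

    std::stable_sort(entries.begin(), entries.end(),
                     [](const Entry& a, const Entry& b) { return a.value < b.value; });

    std::vector<std::string> out;
    for (const auto& e : entries) out.push_back(e.name);
    return out;
}
```

Computing the value once per name, rather than inside the comparator, avoids re-summing the letters on every comparison.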
A possible solution is to keep the result array ordered while it is being built.
When you add a word to the result array, use its computed value to insert it in the right place. That way you can check whether the same value already exists and add the new word after the earlier words with that value.
In other words, after reading each word, apply one step of insertion sort, which is stable:
1. read a word
2. calculate its value
3. insert it in the right place in the array
4. go back to step 1 while i < N, otherwise print the array
This doesn't require a separate sorting pass.
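The insert-as-you-read idea can be sketched as follows. This is one possible reading of the steps, not the answerer's own code: letter_average and insert_in_order are hypothetical names, and std::upper_bound is used to find the position just after all entries whose value is less than or equal to the new one.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Average letter position ('A' = 1), truncated as in the question's code.
// Assumes a non-empty word of uppercase letters.
int letter_average(const std::string& word) {
    int sum = 0;
    for (char c : word) sum += c - 'A' + 1;
    return sum / static_cast<int>(word.length());
}

// Inserts `word` into `sorted` (kept ordered by value) after every entry
// whose value is <= the new one, so equal values stay in arrival order.
void insert_in_order(std::vector<std::pair<int, std::string>>& sorted,
                     const std::string& word) {
    const int value = letter_average(word);
    auto pos = std::upper_bound(
        sorted.begin(), sorted.end(), value,
        [](int v, const std::pair<int, std::string>& e) { return v < e.first; });
    sorted.emplace(pos, value, word);
}
```

Calling insert_in_order once per input word and then printing the second member of each pair gives the required order. Each insertion may shift elements, so the total cost is still quadratic in the worst case, which is harmless for N up to 100.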
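In Python:
def sort_list(list1, list2):
    zipped_pairs = zip(list2, list1)
    # sorted() is stable and the key ignores the name, so ties keep input order
    z = [x for _, x in sorted(zipped_pairs, key=lambda pair: pair[0])]
    return z
times = int(input())
entries = []
ordered = []
for x in range(times):
    entries.append(input())
for x in entries:
    chars = []
    for y in x:
        chars.append(ord(y) - 64)  # ord('A') is 65, so 'A' -> 1
    ordered.append(sum(chars) // len(x))  # truncated average, as in the C++ code
print("\n".join(sort_list(entries, ordered)))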
If you use a std::multimap<int, std::string>, there is no need to sort, as the key already serves as the sorting criterion. Since C++11, elements with equal keys are kept in insertion order, which gives the required tie-breaking.
Here is a solution using std::multimap:
#include <string>
#include <numeric>
#include <iostream>
#include <sstream>
#include <map>
// Test data
std::string test = "10\n"
"COSOPYILSPKNKZSTUZVMEERQDL\n"
"RRPPNG\n"
"PQUPOGTJETGXDQDEMGPNMJEBI\n"
"TQJZMOLQ\n"
"BKNGFEJZWMJNJLSTUBHCFHXWMYUPZM\n"
"YNWEPZKNBOOXNZVWKIUS\n"
"LV\n"
"CJDFYDMYZVOEW\n"
"TMHEJLIDEHT\n"
"KGTGFIFWYTKPWTYQQPGKRRYFXN\n";
int main()
{
std::istringstream strm(test);
// Read in the data
std::multimap<int, std::string> strmap;
int N;
strm >> N;
std::string word;
for (int i = 0; i < N; ++i)
{
strm >> word;
// get the average using std::accumulate and divide by the length of the word
int avg = std::accumulate(word.begin(), word.end(), 0,
[&](int total, char val) { return total + val - 'A' + 1; }) / word.length();
// insert this value in the map
strmap.insert({ avg, word });
}
// output results
for (auto& w : strmap)
std::cout << w.second << "\n";
}
Output:
TMHEJLIDEHT
PQUPOGTJETGXDQDEMGPNMJEBI
BKNGFEJZWMJNJLSTUBHCFHXWMYUPZM
CJDFYDMYZVOEW
RRPPNG
COSOPYILSPKNKZSTUZVMEERQDL
KGTGFIFWYTKPWTYQQPGKRRYFXN
TQJZMOLQ
YNWEPZKNBOOXNZVWKIUS
LV
The std::accumulate is used to add up the values to get the average.
Or just order them at the end (you won't need the second array). Only neighbouring words are swapped, so words with equal values keep their input order:
for (int i = 0; i < N; i++) {
for (int j = 0; j + 1 < N - i; j++) {
int sumA = 0, sumB = 0;
for (int k = 0; k < words[j].size(); k++)
sumA += words[j][k] - 'A' + 1;
for (int k = 0; k < words[j + 1].size(); k++)
sumB += words[j + 1][k] - 'A' + 1;
if (sumA / words[j].size() > sumB / words[j + 1].size())
swap(words[j], words[j + 1]);
}
}
As shown above, it's much better to use a vector to store your data.

Skipping vector iterations based on index equality

Let's say I have three vectors.
#include <vector>
vector<long> Alpha;
vector<long> Beta;
vector<long> Gamma;
And let's assume I've filled them up with numbers, and that we know they're all the same length. (and we know that length ahead of time - let's say it's 3.)
What I want to have at the end is the minimum of all sums Alpha[i] + Beta[j] + Gamma[k] such that i, j, and k are all unequal to each other.
The naive approach would look something like this:
#include <climits>
long min = LONG_MAX;
for (int i = 0; i < 3; i++) {
for (int j = 0; j < 3; j++) {
for (int k=0; k < 3; k++) {
if (i != j && i != k && j != k) {
long sum = Alpha[i] + Beta[j] + Gamma[k];
if (sum < min)
min = sum;
}
}
}
}
Frankly, that code doesn't feel right. Is there a faster and/or more elegant way - one that skips the redundant iterations?
The computational complexity of your algorithm is O(N^3). You can save a little by using:
for (int i = 0; i < 3; i++) {
for (int j = 0; j < 3; j++) {
if ( i == j )
continue;
long sum1 = Alpha[i] + Beta[j];
for (int k=0; k < 3; k++) {
if (i != k && j != k) {
long sum2 = sum1 + Gamma[k];
if (sum2 < min)
min = sum2;
}
}
}
}
However, the complexity of the algorithm is still O(N^3).
Without the if ( i == j ) check, the innermost loop is entered N^2 times. With the check, you skip it N times, so it is entered N(N-1) times. The check is almost not worth it.
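For reference, here is the early-skip loop wrapped in a self-contained function for vectors of any common length. The function name min_distinct_sum and the choice to return LONG_MAX when no valid triple exists are assumptions made for this sketch.

```cpp
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: minimum of a[i] + b[j] + c[k] over pairwise
// distinct i, j, k. Assumes all three vectors have the same length.
// Returns LONG_MAX when no valid triple exists (length < 3).
long min_distinct_sum(const std::vector<long>& a,
                      const std::vector<long>& b,
                      const std::vector<long>& c) {
    const std::size_t n = a.size();
    long best = LONG_MAX;
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            if (i == j) continue;              // skip the whole k loop
            const long partial = a[i] + b[j];  // computed once per (i, j)
            for (std::size_t k = 0; k < n; ++k) {
                if (k == i || k == j) continue;
                if (partial + c[k] < best) best = partial + c[k];
            }
        }
    }
    return best;
}
```

With a = {1, 5, 9}, b = {1, 2, 3}, c = {1, 7, 8} the unconstrained minimum would be 3, but the distinct-index minimum is 9 (a[1] + b[2] + c[0]).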
If you can temporarily modify the input vectors, you can swap the used values with the end of the vectors, and just iterate over the start of the vectors:
for (int i = 0; i < size; i++) {
std::swap(Beta[i],Beta[size-1]); // swap the used index with the last index
std::swap(Gamma[i],Gamma[size-1]);
for (int j = 0; j < size-1; j++) { // don't try the last index
std::swap(Gamma[j],Gamma[size-2]); // swap j with the 2nd-to-last index
for (int k=0; k < size-2; k++) { // don't try the 2 last indices
long sum = Alpha[i] + Beta[j] + Gamma[k];
if (sum < min) {
min = sum;
}
}
std::swap(Gamma[j],Gamma[size-2]); // restore values
}
std::swap(Beta[i],Beta[size-1]); // restore values
std::swap(Gamma[i],Gamma[size-1]);
}

Find a subarray of m*m (2<=m<n) having largest sum; out of an n*n int array(having +ve, -ve, 0s)

I have written a solution for the above problem, but can someone please suggest a more optimized way?
I iterate count from 2 to n-1 and, for each count, examine every subarray of size count*count.
int n = 5; //Size of array, you may take a dynamic array as well
int a[5][5] = {{1,2,3,4,5},{2,4,7,-2,1},{4,3,9,9,1},{5,2,6,8,0},{5,4,3,2,1}};
int max = 0;
int **tempStore, size;
for(int count = 2; count < n; count++)
{
for(int i = 0; i <= (n-count); i++)
{
for(int j = 0; j <= (n-count); j++)
{
int **temp = new int*[count];
for(int i = 0; i < count; ++i) {
temp[i] = new int[count];
}
for(int k = 0; k < count; k++)
{
for(int l = 0; l <count; l++)
{
temp[k][l] = a[i+k][j+l];
}
}
//printing fetched array
int sum = 0;
for(int k = 0; k < count; k++)
{
for(int l = 0; l <count; l++)
{
sum += temp[k][l];
cout<<temp[k][l]<<" ";
}cout<<endl;
}cout<<"Sum = "<<sum<<endl;
if(sum > max)
{
max = sum;
size = count;
tempStore = new int*[count];
for(int i = 0; i < count; ++i) {
tempStore[i] = new int[count];
}
//Locking the max sum array
for(int k = 0; k < count; k++)
{
for(int l = 0; l <count; l++)
{
tempStore[k][l] = temp[k][l];
}
}
}
//printing finished
cout<<"------------------\n";
//Clear temp memory
for(int i = 0; i < count; ++i) { // temp has count rows, not size rows
delete[] temp[i];
}
delete[] temp;
}
}
}
cout<<"Max sum is = "<<max<<endl;
for(int k = 0; k < size; k++)
{
for(int l = 0; l <size; l++)
{
cout<<tempStore[k][l]<<" ";
}cout<<endl;
}cout<<"-------------------------";
//Clear tempStore memory
for(int i = 0; i < size; ++i) {
delete[] tempStore[i];
}
delete[] tempStore;
Example:
1 2 3 4 5
2 4 7 -2 1
4 3 9 9 1
5 2 6 8 0
5 4 3 2 1
Output:
Max sum is = 71
2 4 7 -2
4 3 9 9
5 2 6 8
5 4 3 2
This is a problem best solved using Dynamic Programming (DP) or memoization.
Assuming n is significantly large, you will find that recalculating the sum of every possible combination of matrix will take too long, therefore if you could reuse previous calculations that would make everything much faster.
The idea is to start with the smaller matrices and calculate sum of the larger one reusing the precalculated value of the smaller ones.
long long *sub_solutions = new long long[n*n*m];
#define at(r,c,i) sub_solutions[((i)*n + (r))*n + (c)]
// Winner:
unsigned int w_row = 0, w_col = 0, w_size = 0;
// Fill first layer:
for ( int row = 0; row < n; row++) {
for (int col = 0; col < n; col++) {
at(row, col, 0) = data[row][col];
if (data[row][col] > data[w_row][w_col]) {
w_row = row;
w_col = col;
}
}
}
// Fill remaining layers.
for ( int size = 1; size < m; size++) {
for ( int row = 0; row < n-size; row++) {
for (int col = 0; col < n-size; col++) {
long long sum = data[row+size][col+size];
for (int i = 0; i < size; i++) {
sum += data[row+size][col+i];
sum += data[row+i][col+size];
}
sum += at(row, col, size-1); // Reuse previous solution.
at(row, col, size) = sum;
if (sum > at(w_row, w_col, w_size)) { // Could optimize this part if you only need the sum.
w_row = row;
w_col = col;
w_size = size;
}
}
}
}
// The largest sum is of the sub_matrix starting a w_row, w_col, and has dimensions w_size+1.
long long largest = at(w_row, w_col, w_size);
delete [] sub_solutions;
This algorithm has complexity: O(n*n*m*m) or more precisely: 0.5*n*(n-1)*m*(m-1). (Now I haven't tested this so please let me know if there are any bugs.)
Try this one (using naive approach, will be easier to get the idea):
#include <iostream>
#include<vector>
using namespace std;
int main( )
{
int n = 5; //Size of array, you may take a dynamic array as well
int a[5][5] =
{{2,1,8,9,0},{2,4,7,-2,1},{5,4,3,2,1},{3,4,9,9,2},{5,2,6,8,0}};
int sum, partsum;
int i, j, k, m;
sum = -999999; // presume minimum part sum
for (i = 0; i < n; i++) {
partsum = 0;
m = sizeof(a[i])/sizeof(int);
for (j = 0; j < m; j++) {
partsum += a[i][j];
}
if (partsum > sum) {
k = i;
sum = partsum;
}
}
// print subarray having largest sum
m = sizeof(a[k])/sizeof(int); // m needs to be recomputed
for (j = 0; j < m - 1; j++) {
cout << a[k][j] << ", ";
}
cout << a[k][m - 1] <<"\nmax part sum = " << sum << endl;
return 0;
}
With a cumulative sum, you can compute any partial sum in constant time:
std::vector<std::vector<int>>
compute_cumulative(const std::vector<std::vector<int>>& m)
{
std::vector<std::vector<int>> res(m.size() + 1, std::vector<int>(m.size() + 1));
for (std::size_t i = 0; i != m.size(); ++i) {
for (std::size_t j = 0; j != m.size(); ++j) {
res[i + 1][j + 1] = m[i][j] - res[i][j]
+ res[i + 1][j] + res[i][j + 1];
}
}
return res;
}
int compute_partial_sum(const std::vector<std::vector<int>>& cumulative, std::size_t i, std::size_t j, std::size_t size)
{
return cumulative[i][j] + cumulative[i + size][j + size]
- cumulative[i][j + size] - cumulative[i + size][j];
}
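To show how the cumulative table can drive the actual search, here is a minimal sketch that tries every m with 2 <= m < n and every top-left corner, looking up each square's sum in constant time. The names Matrix, BestSquare and largest_square_sum are invented for this example, and the table is rebuilt inline so the block stands on its own.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Hypothetical result type: top-left corner, side length, and sum.
struct BestSquare {
    std::size_t row, col, size;
    long long sum;
};

// Assumes a square n*n matrix. For n < 3 no m with 2 <= m < n exists
// and the all-zero default is returned.
BestSquare largest_square_sum(const Matrix& a) {
    const std::size_t n = a.size();
    // pre[i][j] holds the sum of all a[r][c] with r < i and c < j.
    std::vector<std::vector<long long>> pre(n + 1, std::vector<long long>(n + 1, 0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            pre[i + 1][j + 1] = a[i][j] + pre[i][j + 1] + pre[i + 1][j] - pre[i][j];

    BestSquare best{0, 0, 0, 0};
    bool found = false;
    for (std::size_t m = 2; m < n; ++m)
        for (std::size_t i = 0; i + m <= n; ++i)
            for (std::size_t j = 0; j + m <= n; ++j) {
                // Sum of the m*m square with top-left corner (i, j).
                const long long s = pre[i + m][j + m] - pre[i][j + m]
                                  - pre[i + m][j] + pre[i][j];
                if (!found || s > best.sum) {
                    best = {i, j, m, s};
                    found = true;
                }
            }
    return best;
}
```

On the 5*5 matrix from the question this reports the 4*4 square starting at row 1, column 0, with sum 71. The table costs O(n^2) to build and the search visits O(n^3) squares, each in constant time.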

Selection sort ascending

That is my function:
int main() {
double data[100];
int num;
cout<<"num= ";
cin>>num;
for(int i = 1; i <= num; i++) {
cout<<i<<" element = ";
cin>>data[i];
}
Sort(data, num);
for (int i = 1; i <= num; i++) {
cout<<data[i]<<endl;
}
return 0;
}
void Sort(double data[], int n) {
int i,j,k;
double min;
for(i = 0; i < n-1; i++) {
k = i;
min = data[k];
for(j = i+1; j < n; j++)
if(data[j] < min) {
k = j;
min = data[k];
}
data[k] = data[i];
data[i] = min;
}
}
If I enter, for example, three elements 8, 9, 1, the output is again 8, 9, 1. Why?
for(int i = 1; i <= num; i++) { // WRONG
I think you mean:
for(int i = 0; i < num; i++) { // RIGHT
Remember that arrays in C and C++ are 0-indexed.
Your sorting function is fine. The only problem is that you enter elements at positions 1 through n, inclusive, while you should use 0 through n-1, inclusive, in both loops of the main() function.
If you need to print numbers 1 through n, use
cout<<(i+1)<<" element = ";
You should get used to starting the for loop at index 0:
for(int i = 0; i < N; ++i)
Fixing these two index errors will make your code run properly.
The reason is:
If you write to data[] starting at index 1, the first item of the array, data[0], is left uninitialized and holds an arbitrary value.
If you enter 3 elements, the array will look like this:
data[0] = ??? // maybe a very very big number
data[1] = 8
data[2] = 9
data[3] = 1
In your Sort function, the index starts at 0 and stops before num, which means your code only sorts data[0], data[1], data[2].
If you use num = 3 and 3 2 1 as input for the original code, you can see that 3 and 2 get sorted.
I guess your Sort code was copied from somewhere; please try to understand it.
Good online algorithm course: https://www.coursera.org/course/algs4partI
A very good online algorithms book: http://algs4.cs.princeton.edu/home/
btw, for(j = i+1; j < n; j++) in the Sort function would be better if it has { } braces.
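Putting the advice from these answers together, a 0-based version of the sort might look like the sketch below. The name selection_sort is chosen for the illustration, and tracking only the index k (instead of a separate min variable) is a small simplification, not the asker's exact code.

```cpp
#include <utility>

// Selection sort over data[0] .. data[n - 1] (0-based, n elements).
void selection_sort(double data[], int n) {
    for (int i = 0; i < n - 1; ++i) {
        int k = i;                      // index of the smallest value seen so far
        for (int j = i + 1; j < n; ++j) {
            if (data[j] < data[k]) {
                k = j;
            }
        }
        std::swap(data[i], data[k]);    // move it to the front of the unsorted part
    }
}
```

The input and output loops in main() then need to use the same range, for(int i = 0; i < num; i++), so that the elements being sorted are exactly the elements that were read.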

Sorting an array fail

I'm trying to sort an array made of random numbers from 1 to 10 in an ascending order. I've come up with this function:
void Sort(int a[10], int n)
{
int j = 0;
for (int i = 0; i < n-1; i++)
{
j = i+1;
if (a[i] > a[j])
{
int aux = a[i];
a[i] = a[j];
a[j] = aux;
}
}
}
But when I try to output the array, the function doesn't seem to have worked:
Sort(array, 10);
cout<<endl;
for (int i = 0; i < 10; i++)
{
cout<<array[i]<<" ";
}
The algorithm in your Sort function is wrong. It doesn't sort at all.
Anyway, don't reinvent the wheel, better use std::sort as:
#include <algorithm>
std::sort(array, array+10);
As for your Sort function, which you presumably want to implement as a bubble sort for learning purposes, a correct implementation (strictly speaking a simple exchange sort, a close relative of bubble sort) is this:
void Sort(int *a, int n)
{
for (int i = 0; i < n ; i++)
{
for (int j = i + 1; j < n ; j++)
{
if (a[i] > a[j])
{
int aux = a[i];
a[i] = a[j];
a[j] = aux;
}
}
}
}
You are only making a single pass, with at most n-1 swaps. You need an outer loop around the sort (assuming it's bubble sort) so that you keep making passes until no more swaps happen.
bool Sort(int a[10], int n)
{
bool swapped = false;
int j = 0;
for (int i = 0; i < n-1; i++)
{
j = i+1;
if (a[i] > a[j])
{
int aux = a[i];
a[i] = a[j];
a[j] = aux;
swapped = true;
}
}
return swapped;
}
int main(int argc, char** argv) {
int a[10] = {5,4,3,1,2,6,7,8,9,10};
while (Sort(a,10));
for (int i=0;i<10;++i) {
std::cout << a[i] << std::endl;
}
}
That only does one pass over the data. Here is an example showing what happens:
8 7 9 2 3 4 5
After going through your function once, the result would be
7 8 2 3 4 5 9
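The difference between one pass and a full sort can be checked with a small sketch like this one. bubble_pass mirrors the single loop in the question's Sort, and bubble_sort simply repeats it until a pass makes no swap; both names are chosen for the illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// One left-to-right pass of adjacent swaps, as in the question's Sort.
// Returns true if anything was swapped.
bool bubble_pass(std::vector<int>& a) {
    bool swapped = false;
    for (std::size_t i = 0; i + 1 < a.size(); ++i) {
        if (a[i] > a[i + 1]) {
            std::swap(a[i], a[i + 1]);
            swapped = true;
        }
    }
    return swapped;
}

// Repeats passes until one completes without a swap.
void bubble_sort(std::vector<int>& a) {
    while (bubble_pass(a)) {
    }
}
```

A single bubble_pass on 8 7 9 2 3 4 5 only carries the largest value to the end, giving 7 8 2 3 4 5 9, while bubble_sort keeps going until the whole vector is ordered.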