I have an unsorted array. I have numerous queries in which I give a range (expressed as two array indexes) and then the maximum value from that range (that is, from the specified slice of the array) has to be returned.
For example:
array[]={23,17,9,45,78,2,4,6,90,1};
query(both inclusive): 2 6
answer: 78
Which algorithm or data structure should I build to quickly retrieve the maximum value from any range? (There are a lot of queries.)
EDIT:
I am using C++
I assume some preprocessing is allowed. This is the Range Minimum Query (RMQ) problem, with maximum in place of minimum.
There is a good review of this problem at TopCoder.
Suitable data structures: segment tree and sqrt-decomposition. Here is a sqrt-decomposition example:
#include <cstdio>
#include <iostream>
#include <algorithm>
#define N int(3e4 + 400) // headroom: padding up to sz * sz can run past 3e4 entries
using namespace std;
int act[N], len, sz, res[N];
int answer(int l, int r) {
int ret = -1, i;
for (i = l; i % sz && i <= r; i++)
ret = max(ret, act[i]);
for (; i + sz <= r + 1; i += sz)
ret = max(ret, res[i / sz]);
for (; i <= r; i++)
ret = max(ret, act[i]);
return ret;
}
int main() {
int i, m;
cin >> m;
for (i = 0; ; i++) {
cin >> act[i];
if (act[i] == -1)
break;
}
len = i;
for (sz = 1; sz * sz < len; sz++);
for (int j = i + 1; j < sz * sz; j++)
act[j] = -1;
for (int i = 0; i < sz * sz; i++)
res[i / sz] = max(res[i / sz], act[i]);
for (int i = 0; i + m <= len; i++)
cout << answer(i, i + m - 1) << endl;
return 0;
}
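For a static array with many queries, a sparse table is another common choice for this kind of problem (it is not one of the two structures named above). Below is a minimal sketch, assuming the array never changes after construction; SparseTableMax is a name made up for illustration. Building takes O(n log n) time and memory, and each query is O(1).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sparse table for range-maximum queries on a static array.
struct SparseTableMax {
    std::vector<std::vector<int>> table; // table[k][i] = max of a[i .. i + 2^k - 1]
    std::vector<int> log2floor;          // log2floor[len] = floor(log2(len))

    explicit SparseTableMax(const std::vector<int>& a) {
        const std::size_t n = a.size();
        log2floor.assign(n + 1, 0);
        for (std::size_t len = 2; len <= n; ++len)
            log2floor[len] = log2floor[len / 2] + 1;
        if (n == 0) return;
        const std::size_t levels = static_cast<std::size_t>(log2floor[n]) + 1;
        table.assign(levels, std::vector<int>(n));
        table[0] = a;
        for (std::size_t k = 1; k < levels; ++k) {
            const std::size_t half = std::size_t(1) << (k - 1);
            // Only positions where a full block of length 2^k fits are filled.
            for (std::size_t i = 0; i + 2 * half <= n; ++i)
                table[k][i] = std::max(table[k - 1][i], table[k - 1][i + half]);
        }
    }

    // Maximum of a[l..r], both bounds inclusive, 0-based.
    int query(std::size_t l, std::size_t r) const {
        assert(!table.empty() && l <= r && r < table[0].size());
        const int k = log2floor[r - l + 1];
        // Two overlapping blocks of length 2^k cover [l, r].
        return std::max(table[k][l], table[k][r + 1 - (std::size_t(1) << k)]);
    }
};
```

With the array from the question, SparseTableMax(a).query(2, 6) returns 78.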
Merge sort the range and take the last element of the sorted range; it will be the maximum.
array[]={23,17,9,45,78,2,4,6,90,1};
query(both inclusive): 2 6
After merge sorting, return the value at the 6th index (index = 1 .. n)
answer: 78
Let's say this is your array: array[]={23,17,9,45,78,2,4,6,90,1};
If your array is not that big, I would suggest preprocessing it into another array like this:
{0,0} = 23; //between arr[0] and arr[0]
{0,1} = 23;
{0,2} = 23;
{0,3} = 45;
{9,9} = 1;
So your new array is going to be newArr = {23,23,23,45,....., 1}
Each lookup is then O(1). For example, if the table is stored as a full n-by-n row-major array, the max between indices 4 and 5 is newArr[4*array.length + 5].
In total, n queries cost O(n) after the O(n^2) preprocessing.
The catch is space: with 10,000 (10^4) integers, newArr holds 10^8 entries * 4 B = 400 MB, so this won't work for arrays much larger than that.
EDIT: I thought of something but it is same as algorithm in Topcoder that MBo mentions.
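A minimal sketch of the lookup-table idea, assuming the table is stored as a full n-by-n row-major array so that the entry for the pair {l, r} sits at index l*n + r. The names buildMaxTable and lookupMax are hypothetical, chosen only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Builds a row-major n*n table where table[l * n + r] holds max(a[l..r])
// for l <= r. Entries with l > r are left at 0 and never read.
std::vector<int> buildMaxTable(const std::vector<int>& a) {
    const std::size_t n = a.size();
    std::vector<int> table(n * n, 0);
    for (std::size_t l = 0; l < n; ++l) {
        int running = a[l];
        for (std::size_t r = l; r < n; ++r) {
            running = std::max(running, a[r]); // extend the range by one element
            table[l * n + r] = running;
        }
    }
    return table;
}

// O(1) lookup of max(a[l..r]), both bounds inclusive, 0-based.
int lookupMax(const std::vector<int>& table, std::size_t n,
              std::size_t l, std::size_t r) {
    assert(l <= r && r < n);
    return table[l * n + r];
}
```

Preprocessing is O(n^2) in both time and memory, which is why this only suits small arrays.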
I have this problem I'm curious about where I have an Array and I need to compute the Sum of this function:
Arr[L] + (Arr[L] ^ Arr[L+1]) + ... + (Arr[L] ^ Arr[L+1] ^ ... ^ Arr[R])
Example:
If the Array given was: [1, 2, 3, 5] and I asked what's the sum on the range [L = 1, R = 3] (assuming 1-based Index), then it'd be:
Sum = 1 + (1 ^ 2) + (1 ^ 2 ^ 3) = 4
In this problem, the Array, the size of the Array, and the Ranges are given. My approach for this is too slow.
There's also a variable called Q which indicates the number of Queries that would process each [L, R].
What I have:
I XOR each element into a running value and add that value to a sum over the range [L, R]. Is there a faster way to compute this if the elements in the array are, say, as large as 1e18 or 1e26?
#include <iostream>
#include <array>
int main (int argc, const char** argv)
{
long long int N, L, R;
std::cin >> N;
long long int Arr[N];
for (long long int i = 0; i < N; i++)
{
std::cin >> Arr[i];
}
std::cin >> L >> R;
long long int Summation = 0, Answer = 0;
for (long long int i = L; i <= R; i++)
{
Answer = Answer ^ Arr[i - 1];
Summation += Answer;
}
std::cout << Summation << '\n';
return 0;
}
There are two loops in your code:
for (long long int i = 0; i < N; i++)
{
std::cin >> Arr[i];
}
long long int Summation = 0, Answer = 0;
for (long long int i = L; i <= R; i++)
{
Answer = Answer ^ Arr[i - 1];
Summation += Answer;
}
The second loop is smaller, and only does two operations (^= and +). These are native CPU instructions; this will be memory bound on the sequential access of Arr[]. You can't speed this up. You need all elements, and it doesn't get faster than a single sequential scan. The CPU prefetcher will hit maximum memory bandwidth.
However, the killer is the first loop. Parsing digits takes many, many more operations, and the range is even larger.
Disclaimer: this is NOT a faster solution!
Changing the problem slightly so that L and R are valid indices into an integer array (range [0, size)), the following recursive function works for me:
size_t xor_rec(size_t* array, size_t size, size_t l, size_t r) {
if (l < 0 || r >= size || l > r) {
return 0; // error control
}
if (r > l + 1) {
size_t prev_to_prev_sum = xor_rec(array, size, l, r - 2);
size_t prev_sum = xor_rec(array, size, l, r - 1);
return prev_sum + ((prev_sum - prev_to_prev_sum) ^ array[r]);
}
if (r == l + 1) {
return array[r - 1] + (array[r - 1] ^ array[r]);
}
if (r == l) {
return array[r];
}
return 0;
}
Edit: changed int to size_t.
If indices are 0-based, that is, L=0 refers to the first element Arr[0], then it's simply this:
int sum = 0;
int prev = 0;
for (int i = L; i <= R; i++)
{
int current = (prev ^ Arr[i]);
sum += current;
prev = current;
}
If it's 1 based, where L=1 is really Arr[0], then it's a quick adjustment:
int sum = 0;
int prev = 0;
for (int i = L; i <= R; i++)
{
int current = (prev ^ Arr[i-1]);
sum += current;
prev = current;
}
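If the same array is queried Q times, one possible way to avoid rescanning the range is to work bit by bit on prefix XORs. The term for position i equals P[i] ^ P[L-1], where P[k] is the XOR of the first k elements, so for each bit it is enough to know how many P[i] in the range have that bit set. The sketch below is my own illustration, not taken from the answers above: XorPrefixSums and its members are hypothetical names, values are assumed to fit in 64 unsigned bits, and the sum wraps modulo 2^64 if it overflows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int kBits = 64; // assumption: every element fits in 64 unsigned bits

struct XorPrefixSums {
    std::vector<std::uint64_t> prefix;            // prefix[k] = a[0] ^ ... ^ a[k-1]
    std::vector<std::vector<std::uint32_t>> ones; // ones[b][k] = how many of prefix[1..k] have bit b set

    explicit XorPrefixSums(const std::vector<std::uint64_t>& a)
        : prefix(a.size() + 1, 0),
          ones(kBits, std::vector<std::uint32_t>(a.size() + 1, 0)) {
        for (std::size_t k = 1; k <= a.size(); ++k) {
            prefix[k] = prefix[k - 1] ^ a[k - 1];
            for (int b = 0; b < kBits; ++b)
                ones[b][k] = ones[b][k - 1] +
                             static_cast<std::uint32_t>((prefix[k] >> b) & 1u);
        }
    }

    // Sum of a[L] + (a[L]^a[L+1]) + ... + (a[L]^...^a[R]), 1-based inclusive.
    // Costs O(kBits) per query; the result wraps modulo 2^64 on overflow.
    std::uint64_t query(std::size_t L, std::size_t R) const {
        assert(1 <= L && L <= R && R < prefix.size());
        const std::uint64_t base = prefix[L - 1];
        const std::uint64_t len = R - L + 1;
        std::uint64_t total = 0;
        for (int b = 0; b < kBits; ++b) {
            const std::uint64_t set = ones[b][R] - ones[b][L - 1];
            // Bit b of (prefix[i] ^ base) is 1 exactly when the two bits differ.
            const std::uint64_t contributing = ((base >> b) & 1u) ? len - set : set;
            total += contributing << b;
        }
        return total;
    }
};
```

Preprocessing costs O(N * 64) time and memory. For the array [1, 2, 3, 5], query(1, 3) gives 4, matching the example in the question.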
Given an integer n and an array a, find the maximum of (a[i]+a[j])*(j-i) with 1<=i<=n-1 and i+1<=j<=n.
Example:
Input
5
1 3 2 5 4
Output
21
Explanation: With i=2 and j=5, the maximum of (a[i]+a[j])*(j-i) is (3+4)*(5-2)=21
Constraints:
n<=10^6
a[i]>0 with 1<=i<=n
I can solve this problem with n<=10^4, but what should I do if n is too large, like the constraints?
First, let's reference the "brute force" algorithm. This will have some issues, which I will call out below, but it is a correct solution.
struct Result
{
size_t i;
size_t j;
int64_t value;
};
Result findBestBruteForce(const vector<int>& a)
{
size_t besti = 0;
size_t bestj = 0;
int64_t bestvalue = INT64_MIN;
for (size_t i = 0; i < a.size(); i++)
{
for (size_t j = i + 1; j < a.size(); j++)
{
// do the math in 64-bit space to avoid overflow
int64_t value = (a[i] + (int64_t)a[j]) * (j - i);
if (value > bestvalue)
{
bestvalue = value;
besti = i;
bestj = j;
}
}
}
return { besti, bestj, bestvalue };
}
The problem with the above code is that it runs in O(N²). More precisely, for the N iterations of the outer for-loop (where i goes from 0 to N), there are an average of N/2 iterations of the inner for-loop. If N is small, this isn't a problem.
On my PC, with full optimizations turned on, the run time is less than a second when N is under 20000. Once N approaches 100000, it takes several seconds to process the 5 billion iterations. Let's just go with a "billion operations per second" as an expected rate. If N were 1000000, the maximum the OP outlined, there would be about 5*10^11 iterations, so it would probably take around 500 seconds. Such is the nature of an N-squared algorithm.
So how can we speed it up? Here's an interesting observation. Let's say our array was this:
10 5 4 15 13 100 101 6
On the first iteration of the outer loop above, where i=0, we'd be computing this on each iteration of the inner loop:
for each j: (a[0]+a[j])(j-0)
for each j: (10+a[j])(j-0)
for each j: [15*1, 14*2, 25*3, 23*4, 110*5, 111*6, 16*7]
= [15, 28, 75, 92, 550, 666, 112]
Hence, for i=0, a[i] = 10 and the largest value computed from that set is 666.
Since A[0] is 10, and we're tracking a current "best" value, there's no incentive to iterate all the values again for i=1, since a[1]==5 is less than 10. There's no j index that would compute a value of (a[1]+a[j])*(j-1) larger than what's already been found, because (5+a[j])*(j-1) will always be less than (10+a[j])*(j-0). (This assumes all values in the array are non-negative.)
So to generalize, the outer loop can skip over any index of i where A[best_i] > A[i]. And that's a real simple alteration to our above code:
Result findBestOptimized(const std::vector<int>& a)
{
if (a.size() < 2)
{
return {0,0,INT64_MIN};
}
size_t besti = 0;
size_t bestj = 0;
int64_t bestvalue = INT64_MIN;
int minimum = INT_MIN;
for (size_t i = 0; i < a.size(); i++)
{
if (a[i] <= minimum)
{
continue;
}
for (size_t j = i + 1; j < a.size(); j++)
{
int64_t value = (a[i] + (int64_t)a[j]) * (j - i);
if (value > bestvalue)
{
bestvalue = value;
besti = i;
bestj = j;
minimum = a[i];
}
}
}
return { besti, bestj, bestvalue };
}
Above, we introduce minimum: the value that A[i] must exceed before we bother running the full inner loop.
I benchmarked this with build optimizations on. On a random array of a million items, it runs in under a second.
But wait... there's another optimization!
If the inner loop fails to find an index j such that value > bestvalue, then we already know that the current A[i] is greater than minimum. Hence, we can increment minimum to A[i] regardless at the end of the inner loop.
Now, I'll present the final solution:
Result findBestOptimizedEvenMore(const std::vector<int>& a)
{
if (a.size() < 2)
{
return { 0,0,INT64_MIN };
}
size_t besti = 0;
size_t bestj = 0;
int64_t bestvalue = INT64_MIN;
int minimum = INT_MIN;
for (size_t i = 0; i < a.size(); i++)
{
if (a[i] <= minimum)
{
continue;
}
for (size_t j = i + 1; j < a.size(); j++)
{
int64_t value = (a[i] + (int64_t)a[j]) * (j - i);
if (value > bestvalue)
{
bestvalue = value;
besti = i;
bestj = j;
}
}
minimum = a[i]; // since we know a[i] > minimum, we can do this
}
return { besti, bestj, bestvalue };
}
I benchmarked the above solution on different array sizes from N=100 to N=1000000. It does all iterations in under 25 milliseconds.
In the above solution, there's likely a worst case runtime of O(N²) again when all the items in the array are in ascending order. But I believe the average case should be on the order of O(N lg N) or better. I'll do some more analysis later if anyone is interested.
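A quick way to gain confidence in the skipping rule is to compare it against brute force on small inputs. The sketch below restates both ideas in compact form and returns only the best value. The names pairValue, bestBruteForce and bestWithSkipping are hypothetical, and the input is assumed to contain positive values, as in the question.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <vector>

// (a[i] + a[j]) * (j - i), computed in 64 bits to avoid overflow.
std::int64_t pairValue(const std::vector<int>& a, std::size_t i, std::size_t j) {
    assert(i < j && j < a.size());
    return (static_cast<std::int64_t>(a[i]) + a[j]) *
           static_cast<std::int64_t>(j - i);
}

// Reference answer: tries every pair, O(N^2).
std::int64_t bestBruteForce(const std::vector<int>& a) {
    std::int64_t best = INT64_MIN;
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = i + 1; j < a.size(); ++j)
            if (pairValue(a, i, j) > best) best = pairValue(a, i, j);
    return best;
}

// Same search, but a left index i is only explored when a[i] is strictly
// larger than every left value explored so far.
std::int64_t bestWithSkipping(const std::vector<int>& a) {
    std::int64_t best = INT64_MIN;
    int minimum = INT_MIN; // assumption: all inputs are greater than INT_MIN
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] <= minimum) continue; // dominated by an earlier, larger a[i]
        for (std::size_t j = i + 1; j < a.size(); ++j)
            if (pairValue(a, i, j) > best) best = pairValue(a, i, j);
        minimum = a[i];
    }
    return best;
}
```

On the question's sample {1, 3, 2, 5, 4} both functions return 21, and on {10, 5, 4, 15, 13, 100, 101, 6} both return 666.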
Note: Some notation for variables and the Result class in the code have been copied from @selbie's excellent answer.
Here's another O(n^2) worst-case solution with (likely provable) O(n) expected performance on random permutations and room for optimization.
Suppose [i, j] are our array bounds for an optimal pair. By the problem definition, this means all elements left of i must be strictly less than A[i], and all elements right of j must be strictly less than A[j].
This means we can compute the left-maxima of A: all elements strictly greater than all previous elements, as well as the right-maxima of A. Then, we only need to consider left endpoints from the left-maxima and right endpoints from the right-maxima.
I don't know the expectation of the product of the sizes of left and right maxima sets, but we can get an upper bound. The size of left maxima is at most the size of the longest increasing subsequence (LIS) of A. The right maxima are at most the size of the longest decreasing subsequence. These aren't independent, but I'm taking as an (unproven) assumption that the LIS and LDS lengths are inversely correlated with each other for random permutations. The right-maxima must start after the left-maxima end, so this seems like a safe assumption.
The length of the LIS for random permutations has mean about 2*sqrt(N) and fluctuations on the order of N^(1/6) that follow the Tracy-Widom distribution. The expected square of the size is therefore about 4N + O(N^(1/3)), so O(N). This isn't exactly the proof we wanted, since you'd need to sum over the probability density function to be rigorous, but the LIS is already an upper bound on the left-maxima size, so I think the conclusion is still true.
C++ code (Result class and some variable names taken from selbie's post, as mentioned):
struct Result
{
size_t i;
size_t j;
int64_t value;
};
Result find_best_sum_size_product(const std::vector<int>& nums)
{
/* Given: list of positive integers nums
Returns: Tuple with (best_i, best_j, best_product)
where best_i and best_j maximize the product
(nums[i]+nums[j])*(j-i) over 0 <= i < j < n
Runtime: O(n^2) worst case,
O(n) average on random permutations.
*/
int n = nums.size();
if (n < 2)
{
return {0,0,INT64_MIN};
}
std::vector<int> left_maxima_indices;
left_maxima_indices.push_back(0);
for (int i = 1; i < n; i++){
if (nums.at(i) > nums.at(left_maxima_indices.back())) {
left_maxima_indices.push_back(i);
}
}
std::vector<int> right_maxima_indices;
right_maxima_indices.push_back(n-1);
for (int i = n-1; i >= 0; i--){
if (nums.at(i) > nums.at(right_maxima_indices.back())) {
right_maxima_indices.push_back(i);
}
}
size_t best_i = 0;
size_t best_j = 0;
int64_t best_product = INT64_MIN;
int i = 0;
int j = 0;
for (size_t left_idx = 0;
left_idx < left_maxima_indices.size();
left_idx++)
{
i = left_maxima_indices.at(left_idx);
for (size_t right_idx = 0;
right_idx < right_maxima_indices.size();
right_idx++)
{
j = right_maxima_indices.at(right_idx);
if (i == j) continue;
int64_t value = (nums.at(i) + (int64_t)nums.at(j)) * (j - i);
if (value > best_product)
{
best_product = value;
best_i = i;
best_j = j;
}
}
}
return { best_i, best_j, best_product };
}
I started from the two excellent answers by @selbie and @kcsquared.
Their solutions gave impressive results for random inputs. What was not clear is the worst-case behavior.
What sequence would correspond to the worst case?
I finally found a critical sequence for these two answers, a triangle sequence: it increases slowly up to a maximum and then decreases slowly. With such a sequence and n=10^5, for example, these answers take more than 10 s.
My solution starts from @selbie's solution and adds two improvements:
I add @kcsquared's trick: to the right of j, there can only be lower elements.
When considering a new left element a[i], it is useless to start from i + 1 to find the second element. We can start from the current best_j.
With these tricks, I was able to improve on the performance of the two posted answers a little. However, it still fails to solve the triangle sequence issue: about 10 s for n = 10^5.
#include <iostream>
#include <vector>
#include <string>
#include <cstdlib>
#include <ctime>
#include <chrono>
#include <climits>
#include <cstdint>
struct Result {
size_t i;
size_t j;
int64_t value;
};
void print (const Result& res, const std::string& prefix = "") {
std::cout << prefix;
std::cout << "(" << res.i << ", " << res.j << ") -> " << res.value << std::endl;
}
Result findBest(const std::vector<int>& a) {
if (a.size() < 2) {
return { 0, 0, INT64_MIN };
}
int n = a.size();
std::vector<int> next_max(n, -1);
int current_max = n-1;
for (int i = n-1; i >= 0; --i) {
if (a[i] > a[current_max]) {
current_max = i;
}
next_max[i] = current_max;
}
size_t besti = 0;
size_t bestj = 0;
int64_t bestvalue = INT64_MIN;
int minimum = INT_MIN;
for (size_t i = 0; i < a.size(); i++) {
if (a[i] <= minimum) {
continue;
}
minimum = a[i];
size_t jmin = (bestj > i) ? bestj : i+1;
for (size_t j = jmin; j < a.size(); j++) {
j = next_max[j];
int64_t value = (a[i] + (int64_t)a[j]) * (j - i);
if (value > bestvalue) {
bestvalue = value;
besti = i;
bestj = j;
}
}
}
return { besti, bestj, bestvalue };
}
int main() {
int n = 1000000;
int vmax = 100000000;
std::vector<int> A (n);
std::srand(std::time(0));
for (int i = 0; i < n; ++i) {
A[i] = rand() % vmax + 1;
}
std::cout << "n = " << n << std::endl;
auto t0 = std::chrono::high_resolution_clock::now();
auto res = findBest (A);
auto t1 = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
print (res, "Random: ");
std::cout << "time = " << duration/1000 << " ms" << std::endl;
int i_max = n/2;
for (int i = 0; i < i_max; ++i) A[i] = i+1;
A[i_max] = 10 * i_max;
for (int i = i_max+1; i < n; ++i) {
A[i] = 2*i_max - i;
}
t0 = std::chrono::high_resolution_clock::now();
res = findBest (A);
t1 = std::chrono::high_resolution_clock::now();
duration = std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
print (res, "Triangle sequence: ");
std::cout << "time = " << duration/1000 << " ms" << std::endl;
return 0;
}
Rod-cutting problem (there is a rod of length n, where n > 0 is an integer, and we want to cut it into pieces of integer lengths such that the total price is maximized): p is the list of prices and n is the length of the rod. I want to cut the rod to get the maximum price while also ensuring that every piece length is unique; that is, if we have already cut a piece of length 3, we cannot cut another piece of length 3.
For example, vector p = {1, 5, 8, 9, 10, 12, 17, 20}; gives me a max price of 21 with lengths 2, 3, 3. As stated, there shouldn't be two pieces of length 3, so the result should be 20 with a single piece of length 8 instead of 2, 3, 3.
How could I modify my code while keeping the time complexity at O(n^2)? Thanks.
int n = 8;
vector<int> p = {1,5,8,9,10,12,17,20};
int cut_rod(vector<int>& p, int n){
int r[n+1];
int s[n+1];
r[0] = 0;
for (int j = 1; j<=n; j++){
int q = INT_MIN;
for (int i = 1; i <= j; i++){
if(q < p[i-1] + r[j-i]){
q = p[i-1] + r[j-i];
s[j] = i;
}
}
r[j] = q;
}
return r[n];
}
You could use an (n + 1) by (n + 1) matrix where you store the pieces used for each length. This way you can check in constant time whether a piece of the same size is already used, and copying a row costs linear time, so in total the time complexity is still O(n^2), but the space complexity is now O(n^2) as well.
I modified the code from GeeksforGeeks below.
// A Dynamic Programming solution for Rod cutting problem
#include<stdio.h>
#include<limits.h>
using namespace std;
// A utility function to get the maximum of two integers
int max(int a, int b) { return (a > b)? a : b;}
/* Returns the best obtainable price for a rod of length n and
price[] as prices of different pieces */
int cutRod(int price[], int n)
{
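int pieces[n+1][n+1];
// Zero the table first: entries are read below before they are all written.
for (int r = 0; r <= n; ++r)
for (int c = 0; c <= n; ++c)
pieces[r][c] = 0;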
int val[n+1];
val[0] = 0;
int i, j;
// Build the table val[] in bottom up manner and return the last entry
// from the table
for (i = 1; i<=n; i++)
{
int max_val = INT_MIN, ind = -1;
for (j = 1; j <= i; j++) {
if (max_val < price[j - 1] + val[i-j]) {
if (pieces[i-j][j] != 1) {
max_val = price[j - 1] + val[i-j];
ind = j;
}
}
}
val[i] = max_val;
for (int k = 0; k <= n; ++k) { // Copy the pieces
pieces[i][k] = pieces[i-ind][k];
}
pieces[i][ind] = 1; // Add the piece of length ind (which is the max j)
}
return val[n];
}
/* Driver program to test above functions */
int main()
{
int arr[] = {1,5,8,9,10,12,17,20};
int size = sizeof(arr)/sizeof(arr[0]);
printf("Maximum Obtainable Value is %d\n", cutRod(arr, size));
getchar();
return 0;
}
The DP algorithm stores in val[i], for i from 0 to n, the maximum price for a rod of length i. The cuts of a rod of length i are stored in pieces[i], an array indexed from 0 to n: a 1 at position j means that achieving the max value val[i] uses a piece of length j. For a given length i, the algorithm tries a cut of length j and computes the sum of the price of a piece of length j and the max price of the remaining piece of length i-j, which is already calculated. This sum is maximal for some j, meaning there is some j for which price[j - 1] + val[i-j] is largest (where j isn't already an existing cut). So for length i we have a piece of length j plus the pieces for length i - j, which we saved in pieces[i - j]. To get pieces[i], we copy pieces[i - j] and add the piece of length j.
You can get the lengths of the pieces like this:
for (int i = 0; i <= n; ++i)
if (pieces[n][i] == 1) cout << i << ' ';
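An alternative that sidesteps tracking piece sets: since each length may be used at most once, the problem can also be read as a 0/1 knapsack over the lengths 1..n. This is a swapped-in technique, not the table approach described above, and cutRodDistinct is a hypothetical name. A minimal sketch, assuming non-negative prices:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Best total price when cutting a rod of length n into pieces of pairwise
// distinct integer lengths; price[len - 1] is the price of a piece of length len.
int cutRodDistinct(const std::vector<int>& price, int n) {
    assert(n >= 0 && static_cast<std::size_t>(n) <= price.size());
    const int UNREACHABLE = -1; // valid only because prices are assumed non-negative
    std::vector<int> best(n + 1, UNREACHABLE);
    best[0] = 0;
    for (int len = 1; len <= n; ++len) {
        // Walk totals downward so each length is used at most once.
        for (int total = n; total >= len; --total) {
            if (best[total - len] != UNREACHABLE)
                best[total] = std::max(best[total],
                                       best[total - len] + price[len - 1]);
        }
    }
    return best[n];
}
```

The two nested loops keep the running time at O(n^2) with O(n) extra memory. For p = {1, 5, 8, 9, 10, 12, 17, 20} and n = 8 this returns 20, the value the question expects.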
I need a way to solve the classic 5SUM problem without hashing or with a memory efficient way of hashing.
The problem asks you to find how many 5-element subsequences in a given array of length N have a sum equal to S.
Ex:
Input
6 5
1 1 1 1 1 1
Output
6
The restrictions are:
N <= 1000 ( size of the array )
S <= 400000000 ( the sum of the subsequence )
Memory usage <= 5555 kbs
Execution time 2.2s
I'm pretty sure the expected complexity is O(N^3). Due to the memory limitations, hashing doesn't provide an actual O(1) lookup time.
The best I got was 70 points using this code. ( I got TLE on 6 tests )
#include <iostream>
#include <fstream>
#include <algorithm>
#include <vector>
#define MAX 1003
#define MOD 10472
using namespace std;
ifstream in("take5.in");
ofstream out("take5.out");
vector<pair<int, int>> has[MOD];
int v[MAX];
int pnt;
vector<pair<int, int>>::iterator it;
inline void ins(int val) {
pnt = val%MOD;
it = lower_bound(has[pnt].begin(), has[pnt].end(), make_pair(val, -1));
if(it == has[pnt].end() || it->first != val) {
has[pnt].push_back({val, 1});
sort(has[pnt].begin(), has[pnt].end());
return;
}
it->second++;
}
inline int get(int val) {
pnt = val%MOD;
it = lower_bound(has[pnt].begin(), has[pnt].end(), make_pair(val, -1));
if(it == has[pnt].end() || it->first != val)
return 0;
return it->second;
}
int main() {
int n,S;
int ach = 0;
int am = 0;
int rez = 0;
in >> n >> S;
for(int i = 1; i <= n; i++)
in >> v[i];
sort(v+1, v+n+1);
for(int i = n; i >= 1; i--) {
if(v[i] > S)
continue;
for(int j = i+1; j <= n; j++) {
if(v[i]+v[j] > S)
break;
ins(v[i]+v[j]);
}
int I = i-1;
if(S-v[I] < 0)
continue;
for(int j = 1; j <= I-1; j++) {
if(S-v[I]-v[j] < 0)
break;
for(int k = 1; k <= j-1; k++) {
if(S-v[I]-v[j]-v[k] < 0)
break;
ach = S-v[I]-v[j]-v[k];
rez += get(ach);
}
}
}
out << rez << '\n';
return 0;
}
I think it can be done. We are looking for all subsets of 5 items in the array arr with the correct SUM. We have array with indexes 0..N-1. Third item of those five can have index i in range 2..N-3. We cycle through all those indexes. For every index i we generate all combinations of two numbers for index in range 0..i-1 on the left of index i and all combinations of two numbers for index in the range i+1..N-1 on the right of index i. For every index i there are less than N*N combinations on the left plus on the right side. We would store only sum for every combination, so it would not be more than 1000 * 1000 * 4 = 4MB.
Now we have two sequences of numbers (the sums) and task is this: Take one number from first sequence and one number from second sequence and get sum equal to Si = SUM - arr[i]. How many combinations are there? To do it efficiently, sequences have to be sorted. Say first is sorted ascending and have numbers a, a, a, b, c ,.... Second is sorted descending and have numbers Z, Z, Y, X, W, .... If a + Z > Si then we can throw Z away, because we do not have smaller number to match. If a + Z < Si we can throw away a, because we do not have bigger number to match. And if a + Z = Si we have 2 * 3 = 6 new combinations and get rid of both a and Z. If we get sorting for free, it is nice O(N^3) algorithm.
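The counting step described in the previous paragraph can be sketched on its own as follows. The function name countPairsWithSum is hypothetical, and the two inputs are assumed to be already sorted, the first ascending and the second descending.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Counts pairs (x from ascending, y from descending) with x + y == target.
// Runs in linear time over the two sequences.
long long countPairsWithSum(const std::vector<int>& ascending,
                            const std::vector<int>& descending,
                            long long target) {
    assert(std::is_sorted(ascending.begin(), ascending.end()));
    assert(std::is_sorted(descending.rbegin(), descending.rend()));
    long long count = 0;
    std::size_t a = 0, z = 0;
    while (a < ascending.size() && z < descending.size()) {
        const long long sum = static_cast<long long>(ascending[a]) + descending[z];
        if (sum > target) {
            ++z; // no remaining left value is small enough for this right value
        } else if (sum < target) {
            ++a; // no remaining right value is large enough for this left value
        } else {
            // Consume the whole run of equal values on both sides at once.
            const int va = ascending[a];
            const int vz = descending[z];
            long long runA = 0, runZ = 0;
            while (a < ascending.size() && ascending[a] == va) { ++a; ++runA; }
            while (z < descending.size() && descending[z] == vz) { ++z; ++runZ; }
            count += runA * runZ;
        }
    }
    return count;
}
```

For example, with the sequences 1, 1, 1, 2, 3 and 9, 9, 8, 7, 6 and a target of 10, the first match contributes 3 * 2 = 6 pairs and the total is 8.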
While sorting is not for free, it is O(N * N^2 * log(N^2)) = O(N^3 * log(N)). We need to do sorting in linear time, which is not possible. Or is it? In index i+1 we can reuse sequences from index i. There are only few new combinations for i+1 - only those that involve number arr[i] together with some number from index 0..i-1. If we sort them (and we can, because there are not N*N of them, but N at most), all we need is to merge two sorted sequences. And that can be done in linear time. We can even avoid sorting completely if we sort arr at the beginning. We just merge.
For the second sequence, the merging does not involve adding but removing, but it is very similar.
The implementation seems to work, but I expect there is an off-by-one error somewhere ;-)
#include <iostream>
#include <fstream>
#include <algorithm>
#include <vector>
using namespace std;
int Generate(int arr[], int i, int sums[], int N, int NN)
{
int p1 = 0;
for (int i1 = 0; i1 < i - 1; ++i1)
{
int ai = arr[i1];
for (int i2 = i1 + 1; i2 < i; ++i2)
{
sums[p1++] = ai + arr[i2];
}
}
sort(sums, sums + p1);
return p1;
}
int Combinations(int n, int sums[], int p1, int p2, int NN)
{
int cnt = 0;
int a = 0;
int b = NN - p2;
do
{
int state = sums[a] + sums[b] - n;
if (state > 0) { ++b; }
else if (state < 0) { ++a; }
else
{
int cnta = 0;
int lastA = sums[a];
while (a < p1 && sums[a] == lastA) { a++; cnta++; }
int cntb = 0;
int lastB = sums[b];
while (b < NN && sums[b] == lastB) { b++; cntb++; }
cnt += cnta * cntb;
}
} while (b < NN && a < p1);
return cnt;
}
int Add(int arr[], int i, int sums[], int p2, int N, int NN)
{
int ii = N - 1;
int n = arr[i];
int nn = n + arr[ii--];
int ip = NN - p2;
int newP2 = p2 + N - i - 1;
for (int p = NN - newP2; p < NN; ++p)
{
if (ip < NN && (ii < i || sums[ip] > nn))
{
sums[p] = sums[ip++];
}
else
{
sums[p] = nn;
nn = n + arr[ii--];
}
}
return newP2;
}
int Remove(int arr[], int i, int sums[], int p1)
{
int ii = 0;
int n = arr[i];
int nn = n + arr[ii++];
int pp = 0;
int p = 0;
for (; p < p1 - i; ++p)
{
while (ii <= i && sums[pp] == nn)
{
++pp;
nn = n + arr[ii++];
}
sums[p] = sums[pp++];
}
return p;
}
int main() {
ifstream in("take5.in");
ofstream out("take5.out");
int N, SUM;
in >> N >> SUM;
int* arr = new int[N];
for (int i = 0; i < N; i++)
in >> arr[i];
sort(arr, arr + N);
int NN = (N - 3) * (N - 4) / 2 + 1;
int* sums = new int[NN];
int combinations = 0;
int p1 = 0;
int p2 = 1;
for (int i = N - 3; i >= 2; --i)
{
if (p1 == 0)
{
p1 = Generate(arr, i, sums, N, NN);
sums[NN - 1] = arr[N - 1] + arr[N - 2];
}
else
{
p1 = Remove(arr, i, sums, p1);
p2 = Add(arr, i + 1, sums, p2, N, NN);
}
combinations += Combinations(SUM - arr[i], sums, p1, p2, NN);
}
out << combinations << '\n';
return 0;
}
I am trying to solve this problem in spoj
I need to find the number of rotations of a given string that will make it lexicographically smallest among all the rotations.
For example:
Original: ama
First rotation: maa
Second rotation: aam
This is the lexicographically smallest rotation, so the answer is 2.
Here's my code:
string s,tmp;
char ss[100002];
scanf("%s",ss);
s=ss;
tmp=s;
int i,len=s.size(),ans=0,t=0;
for(i=0;i<len;i++)
{
string x=s.substr(i,len-i)+s.substr(0,i);
if(x<tmp)
{
tmp=x;
t=ans;
}
ans++;
}
cout<<t<<endl;
I am getting "Time Limit Exceeded" for this solution. I don't understand what optimizations can be made. How can I increase the speed of my solution?
You can use a modified suffix array. It is modified in that comparisons must not stop at the end of the word; they wrap around instead.
Here is the code for a similar problem I solved (SA is the suffix array):
//719
//Glass Beads
//Misc;String Matching;Suffix Array;Circular
#include <iostream>
#include <iomanip>
#include <cstring>
#include <string>
#include <cmath>
#include <algorithm>
#define MAX 10050
using namespace std;
int RA[MAX], tempRA[MAX];
int SA[MAX], tempSA[MAX];
int C[MAX];
void suffix_sort(int n, int k) {
memset(C, 0, sizeof C);
for (int i = 0; i < n; i++)
C[RA[(i + k)%n]]++;
int sum = 0;
for (int i = 0; i < max(256, n); i++) {
int t = C[i];
C[i] = sum;
sum += t;
}
for (int i = 0; i < n; i++)
tempSA[C[RA[(SA[i] + k)%n]]++] = SA[i];
memcpy(SA, tempSA, n*sizeof(int));
}
void suffix_array(string &s) {
int n = s.size();
for (int i = 0; i < n; i++)
RA[i] = s[i];
for (int i = 0; i < n; i++)
SA[i] = i;
for (int k = 1; k < n; k *= 2) {
suffix_sort(n, k);
suffix_sort(n, 0);
int r = tempRA[SA[0]] = 0;
for (int i = 1; i < n; i++) {
int s1 = SA[i], s2 = SA[i-1];
bool equal = true;
equal &= RA[s1] == RA[s2];
equal &= RA[(s1+k)%n] == RA[(s2+k)%n];
tempRA[SA[i]] = equal ? r : ++r;
}
memcpy(RA, tempRA, n*sizeof(int));
}
}
int main() {
int tt; cin >> tt;
while(tt--) {
string s; cin >> s;
suffix_array(s);
cout << SA[0]+1 << endl;
}
}
I took this implementation mostly from this book. There is an easier to write O(n log²n) version, but may not be efficient enough for your case (n=10^5). This version is O(n log n), and it's not the most efficient algorithm. The wikipedia article lists some O(n) algorithms, but I find most of them too complex to write during a programming contest. This O(n log n) is usually enough for most problems.
You can find some slides explaining suffix array concept (from the author of the book I mentioned) here.
I know this comes very late, but I stumbled across this question from Google while searching for an even faster variant of this algorithm. It turns out a good implementation can be found on GitHub: https://gist.github.com/MaskRay/8803371
It uses the Lyndon factorization. That means it repeatedly splits the string into lexicographically non-increasing Lyndon words. A Lyndon word is a string that is strictly smaller than all of its nontrivial rotations. Doing this in a circular way yields the lexicographically minimal rotation of the string as the last Lyndon word found.
int lyndon_word(const char *a, int n)
{
int i = 0, j = 1, k;
while (j < n) {
// Invariant: i < j and indices in [0,j) \ i cannot be the first optimum
for (k = 0; k < n && a[(i+k)%n] == a[(j+k)%n]; k++);
if (a[(i+k)%n] <= a[(j+k)%n]) {
// if k < n
// foreach p in [j,j+k], s_p > s_{p-(j-i)}
// => [j,j+k] are all suboptimal
// => indices in [0,j+k+1) \ i are suboptimal
// else
// None of [j,j+k] is the first optimum
j += k+1;
} else {
// foreach p in [i,i+k], s_p > s_{p+(j-i)}
// => [i,i+k] are all suboptimal
// => [0,j) and [0,i+k+1) are suboptimal
// if i+k+1 < j
// j < j+1 and indices in [0,j+1) \ j are suboptimal
// else
// i+k+1 < i+k+2 and indices in [0,i+k+2) \ (i+k+1) are suboptimal
i += k+1;
if (i < j)
i = j++;
else
j = i+1;
}
}
// j >= n => [0,n) \ i cannot be the first optimum
return i;
}
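For cross-checking any of the approaches above on small inputs, here is a sketch of the classic two-pointer "minimal representation" scan next to a brute-force reference. It is a different routine from lyndon_word, and the names leastRotationIndex, rotateLeft and bruteForceLeastRotation are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Returns the start index of a lexicographically minimal rotation of s.
// i and j are two candidate starts; k is the length of their common prefix.
int leastRotationIndex(const std::string& s) {
    const int n = static_cast<int>(s.size());
    int i = 0, j = 1, k = 0;
    while (i < n && j < n && k < n) {
        const char a = s[(i + k) % n];
        const char b = s[(j + k) % n];
        if (a == b) { ++k; continue; }
        // The losing candidate and the next k positions after it cannot win.
        if (a > b) i += k + 1; else j += k + 1;
        if (i == j) ++j;
        k = 0;
    }
    return std::min(i, j);
}

// s rotated left by k characters, for 0 <= k <= s.size().
std::string rotateLeft(const std::string& s, std::size_t k) {
    assert(k <= s.size());
    return s.substr(k) + s.substr(0, k);
}

// O(n^2) reference: first index whose rotation is the smallest.
int bruteForceLeastRotation(const std::string& s) {
    int best = 0;
    for (int k = 1; k < static_cast<int>(s.size()); ++k)
        if (rotateLeft(s, k) < rotateLeft(s, best)) best = k;
    return best;
}
```

For the question's example, leastRotationIndex("ama") gives 2.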