The task (from a Bulgarian judge, click on "Език" to change it to English):
I am given the size of the first (S1 = A) of N corals. The size of every subsequent coral (Si, where i > 1) is calculated using the formula (B*Si-1 + C)%D, where A, B, C and D are some constants. I am told that Nemo is nearby the Kth coral (when the sizes of all corals are sorted in ascending order).
What is the size of the above-mentioned Kth coral ?
I will have T tests and for every one of them I will be given N, K, A, B, C and D and prompted to output the size of the Kth coral.
The requirements:
1 ≤ T ≤ 3
1 ≤ K ≤ N ≤ 107
0 ≤ A < D ≤ 1018
1 ≤ C, B*D ≤ 1018
Memory available is 64 MB
Time limit is 1.9 sec
The problem I have:
For the worst case scenario I will need 107*8B which is 76 MB.
The solution If the memory available was at least 80 MB would be:
#include <iostream>
#include <vector>
#include <iterator>
#include <algorithm>
using biggie = long long;
int main() {
int t;
std::cin >> t;
int i, n, k, j;
biggie a, b, c, d;
std::vector<biggie>::iterator it_ans;
for (i = 0; i != t; ++i) {
std::cin >> n >> k >> a >> b >> c >> d;
std::vector<biggie> lut{ a };
lut.reserve(n);
for (j = 1; j != n; ++j) {
lut.emplace_back((b * lut.back() + c) % d);
}
it_ans = std::next(lut.begin(), k - 1);
std::nth_element(lut.begin(), it_ans, lut.end());
std::cout << *it_ans << '\n';
}
return 0;
}
Question 1: How can I approach this CP task given the requirements listed above ?
Question 2: Is it somehow possible to use std::nth_element to solve it since I am not able to store all N elements ? I mean using std::nth_element in a sliding window technique (If this is possible).
# Christian Sloper
#include <iostream>
#include <queue>
using biggie = long long;
int main() {
int t;
std::cin >> t;
int i, n, k, j, j_lim;
biggie a, b, c, d, prev, curr;
for (i = 0; i != t; ++i) {
std::cin >> n >> k >> a >> b >> c >> d;
if (k < n - k + 1) {
std::priority_queue<biggie, std::vector<biggie>, std::less<biggie>> q;
q.push(a);
prev = a;
for (j = 1; j != k; ++j) {
curr = (b * prev + c) % d;
q.push(curr);
prev = curr;
}
for (; j != n; ++j) {
curr = (b * prev + c) % d;
if (curr < q.top()) {
q.pop();
q.push(curr);
}
prev = curr;
}
std::cout << q.top() << '\n';
}
else {
std::priority_queue<biggie, std::vector<biggie>, std::greater<biggie>> q;
q.push(a);
prev = a;
for (j = 1, j_lim = n - k + 1; j != j_lim; ++j) {
curr = (b * prev + c) % d;
q.push(curr);
prev = curr;
}
for (; j != n; ++j) {
curr = (b * prev + c) % d;
if (curr > q.top()) {
q.pop();
q.push(curr);
}
prev = curr;
}
std::cout << q.top() << '\n';
}
}
return 0;
}
This gets accepted (Succeeds all 40 tests. Largest time 1.4 seconds, for a test with T=3 and D≤10^9. Largest time for a test with larger D (and thus T=1) is 0.7 seconds.).
#include <iostream>
using biggie = long long;
int main() {
int t;
std::cin >> t;
int i, n, k, j;
biggie a, b, c, d;
for (i = 0; i != t; ++i) {
std::cin >> n >> k >> a >> b >> c >> d;
biggie prefix = 0;
for (int shift = d > 1000000000 ? 40 : 20; shift >= 0; shift -= 20) {
biggie prefix_mask = ((biggie(1) << (40 - shift)) - 1) << (shift + 20);
int count[1 << 20] = {0};
biggie s = a;
int rank = 0;
for (j = 0; j != n; ++j) {
biggie s_vs_prefix = s & prefix_mask;
if (s_vs_prefix < prefix)
++rank;
else if (s_vs_prefix == prefix)
++count[(s >> shift) & ((1 << 20) - 1)];
s = (b * s + c) % d;
}
int i = -1;
while (rank < k)
rank += count[++i];
prefix |= biggie(i) << shift;
}
std::cout << prefix << '\n';
}
return 0;
}
The result is a 60 bits number. I first determine the high 20 bits with one pass through the numbers, then the middle 20 bits in another pass, then the low 20 bits in another.
For the high 20 bits, generate all the numbers and count how often each high 20 bits pattern occurrs. After that, add up the counts until you reach K. The pattern where you reach K, that pattern covers the K-th largest number. In other words, that's the result's high 20 bits.
The middle and low 20 bits are computed similarly, except we take the by then known prefix (the high 20 bits or high+middle 40 bits) into account. As a little optimization, when D is small, I skip computing the high 20 bits. That got me from 2.1 seconds down to 1.4 seconds.
This solution is like user3386109 described, except with bucket size 2^20 instead of 10^6 so I can use bit operations instead of divisions and think of bit patterns instead of ranges.
For the memory constraint you hit:
(B*Si-1 + C)%D
requires only the value (Si-2) before itself. So you can compute them in pairs, to use only 1/2 of total you need. This only needs indexing even values and iterating once for odd values. So you can just use half-length LUT and compute the odd value in-flight. Modern CPUs are fast enough to do extra calculations like these.
std::vector<biggie> lut{ a_i,a_i_2,a_i_4,... };
a_i_3=computeOddFromEven(lut[1]);
You can make a longer stride like 4,8 too. If dataset is large, RAM latency is big. So it's like having checkpoints in whole data search space to balance between memory and core usage. 1000-distance checkpoints would put a lot of cpu cycles into re-calculations but then the array would fit CPU's L2/L1 cache which is not bad. When sorting, the maximum re-calc iteration per element would be n=1000 now. O(1000 x size) maybe it's a big constant but maybe somehow optimizable by compiler if some constants really const?
If CPU performance becomes problem again:
write a compiling function that writes your source code with all the "constant" given by user to a string
compile the code using command-line (assuming target computer has some accessible from command line like g++ from main program)
run it and get results
Compiler should enable more speed/memory optimizations when those are really constant in compile-time rather than depending on std::cin.
If you really need to add a hard-limit to the RAM usage, then implement a simple cache with the backing-store as your heavy computations with brute-force O(N^2) (or O(L x N) with checkpoints every L elements as in first method where L=2 or 4, or ...).
Here's a sample direct-mapped cache with 8M long-long value space:
int main()
{
std::vector<long long> checkpoints = {
a_0, a_16, a_32,...
};
auto cacheReadMissFunction = [&](int key){
// your pure computational algorithm here, helper meant to show variable
long long result = checkpoints[key/16];
for(key - key%16 times)
result = iterate(result);
return result;
};
auto cacheWriteMissFunction = [&](int key, long long value){
/* not useful for your algorithm as it doesn't change behavior per element */
// backing_store[key] = value;
};
// due to special optimizations, size has to be 2^k
int cacheSize = 1024*1024*8;
DirectMappedCache<int, long long> cache(cacheSize,cacheReadMissFunction,cacheWriteMissFunction);
std::cout << cache.get(20)<<std::endl;
return 0;
}
If you use a cache-friendly sorting-algorithm, a direct cache access would make a lot of re-use for nearly all the elements in comparisons if you fill the output buffer/terminal with elements one by one by following something like a bitonic-sort-path (that is known in compile-time). If that doesn't work, then you can try accessing files as a "backing-store" of cache for sorting whole array at once. Is file system prohibited for use? Then the online-compiling method above won't work either.
Implementation of a direct mapped cache (don't forget to call flush() after your algorithm finishes, if you use any cache.set() method):
#ifndef DIRECTMAPPEDCACHE_H_
#define DIRECTMAPPEDCACHE_H_
#include<vector>
#include<functional>
#include<mutex>
#include<iostream>
/* Direct-mapped cache implementation
* Only usable for integer type keys in range [0,maxPositive-1]
*
* CacheKey: type of key (only integers: int, char, size_t)
* CacheValue: type of value that is bound to key (same as above)
*/
template< typename CacheKey, typename CacheValue>
class DirectMappedCache
{
public:
// allocates buffers for numElements number of cache slots/lanes
// readMiss: cache-miss for read operations. User needs to give this function
// to let the cache automatically get data from backing-store
// example: [&](MyClass key){ return redis.get(key); }
// takes a CacheKey as key, returns CacheValue as value
// writeMiss: cache-miss for write operations. User needs to give this function
// to let the cache automatically set data to backing-store
// example: [&](MyClass key, MyAnotherClass value){ redis.set(key,value); }
// takes a CacheKey as key and CacheValue as value
// numElements: has to be integer-power of 2 (e.g. 2,4,8,16,...)
DirectMappedCache(CacheKey numElements,
const std::function<CacheValue(CacheKey)> & readMiss,
const std::function<void(CacheKey,CacheValue)> & writeMiss):size(numElements),sizeM1(numElements-1),loadData(readMiss),saveData(writeMiss)
{
// initialize buffers
for(size_t i=0;i<numElements;i++)
{
valueBuffer.push_back(CacheValue());
isEditedBuffer.push_back(0);
keyBuffer.push_back(CacheKey()-1);// mapping of 0+ allowed
}
}
// get element from cache
// if cache doesn't find it in buffers,
// then cache gets data from backing-store
// then returns the result to user
// then cache is available from RAM on next get/set access with same key
inline
const CacheValue get(const CacheKey & key) noexcept
{
return accessDirect(key,nullptr);
}
// only syntactic difference
inline
const std::vector<CacheValue> getMultiple(const std::vector<CacheKey> & key) noexcept
{
const int n = key.size();
std::vector<CacheValue> result(n);
for(int i=0;i<n;i++)
{
result[i]=accessDirect(key[i],nullptr);
}
return result;
}
// thread-safe but slower version of get()
inline
const CacheValue getThreadSafe(const CacheKey & key) noexcept
{
std::lock_guard<std::mutex> lg(mut);
return accessDirect(key,nullptr);
}
// set element to cache
// if cache doesn't find it in buffers,
// then cache sets data on just cache
// writing to backing-store only happens when
// another access evicts the cache slot containing this key/value
// or when cache is flushed by flush() method
// then returns the given value back
// then cache is available from RAM on next get/set access with same key
inline
void set(const CacheKey & key, const CacheValue & val) noexcept
{
accessDirect(key,&val,1);
}
// thread-safe but slower version of set()
inline
void setThreadSafe(const CacheKey & key, const CacheValue & val) noexcept
{
std::lock_guard<std::mutex> lg(mut);
accessDirect(key,&val,1);
}
// use this before closing the backing-store to store the latest bits of data
void flush()
{
try
{
std::lock_guard<std::mutex> lg(mut);
for (size_t i=0;i<size;i++)
{
if (isEditedBuffer[i] == 1)
{
isEditedBuffer[i]=0;
auto oldKey = keyBuffer[i];
auto oldValue = valueBuffer[i];
saveData(oldKey,oldValue);
}
}
}catch(std::exception &ex){ std::cout<<ex.what()<<std::endl; }
}
// direct mapped access
// opType=0: get
// opType=1: set
CacheValue const accessDirect(const CacheKey & key,const CacheValue * value, const bool opType = 0)
{
// find tag mapped to the key
CacheKey tag = key & sizeM1;
// compare keys
if(keyBuffer[tag] == key)
{
// cache-hit
// "set"
if(opType == 1)
{
isEditedBuffer[tag]=1;
valueBuffer[tag]=*value;
}
// cache hit value
return valueBuffer[tag];
}
else // cache-miss
{
CacheValue oldValue = valueBuffer[tag];
CacheKey oldKey = keyBuffer[tag];
// eviction algorithm start
if(isEditedBuffer[tag] == 1)
{
// if it is "get"
if(opType==0)
{
isEditedBuffer[tag]=0;
}
saveData(oldKey,oldValue);
// "get"
if(opType==0)
{
const CacheValue && loadedData = loadData(key);
valueBuffer[tag]=loadedData;
keyBuffer[tag]=key;
return loadedData;
}
else /* "set" */
{
valueBuffer[tag]=*value;
keyBuffer[tag]=key;
return *value;
}
}
else // not edited
{
// "set"
if(opType == 1)
{
isEditedBuffer[tag]=1;
}
// "get"
if(opType == 0)
{
const CacheValue && loadedData = loadData(key);
valueBuffer[tag]=loadedData;
keyBuffer[tag]=key;
return loadedData;
}
else // "set"
{
valueBuffer[tag]=*value;
keyBuffer[tag]=key;
return *value;
}
}
}
}
private:
const CacheKey size;
const CacheKey sizeM1;
std::mutex mut;
std::vector<CacheValue> valueBuffer;
std::vector<unsigned char> isEditedBuffer;
std::vector<CacheKey> keyBuffer;
const std::function<CacheValue(CacheKey)> loadData;
const std::function<void(CacheKey,CacheValue)> saveData;
};
#endif /* DIRECTMAPPEDCACHE_H_ */
You can solve this problem using a Max-heap.
Insert the first k elements into the max-heap. The largest element of these k will now be at the root.
For each remaining element e:
Compare e to the root.
If e is larger than the root, discard it.
If e is smaller than the root, remove the root and insert e into the heap structure.
After all elements have been processed, the k-th smallest element is at the root.
This method uses O(K) space and O(n log n) time.
There’s an algorithm that people often call LazySelect that I think would be perfect here.
With high probability, we make two passes. In the first pass, we save a random sample of size n much less than N. The answer will be around index (K/N)n in the sorted sample, but due to the randomness, we have to be careful. Save the values a and b at (K/N)n ± r instead, where r is the radius of the window. In the second pass, we save all of the values in [a, b], count the number of values less than a (let it be L), and select the value with index K−L if it’s in the window (otherwise, try again).
The theoretical advice on choosing n and r is fine, but I would be pragmatic here. Choose n so that you use most of the available memory; the bigger the sample, the more informative it is. Choose r fairly large as well, but not quite as aggressively due to the randomness.
C++ code below. On the online judge, it’s faster than Kelly’s (max 1.3 seconds on the T=3 tests, 0.5 on the T=1 tests).
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <iostream>
#include <limits>
#include <optional>
#include <random>
#include <vector>
namespace {
class LazySelector {
public:
static constexpr std::int32_t kTargetSampleSize = 1000;
explicit LazySelector() { sample_.reserve(1000000); }
void BeginFirstPass(const std::int32_t n, const std::int32_t k) {
sample_.clear();
mask_ = n / kTargetSampleSize;
mask_ |= mask_ >> 1;
mask_ |= mask_ >> 2;
mask_ |= mask_ >> 4;
mask_ |= mask_ >> 8;
mask_ |= mask_ >> 16;
}
void FirstPass(const std::int64_t value) {
if ((gen_() & mask_) == 0) {
sample_.push_back(value);
}
}
void BeginSecondPass(const std::int32_t n, const std::int32_t k) {
sample_.push_back(std::numeric_limits<std::int64_t>::min());
sample_.push_back(std::numeric_limits<std::int64_t>::max());
const double p = static_cast<double>(sample_.size()) / n;
const double radius = 2 * std::sqrt(sample_.size());
const auto lower =
sample_.begin() + std::clamp<std::int32_t>(std::floor(p * k - radius),
0, sample_.size() - 1);
const auto upper =
sample_.begin() + std::clamp<std::int32_t>(std::ceil(p * k + radius), 0,
sample_.size() - 1);
std::nth_element(sample_.begin(), upper, sample_.end());
std::nth_element(sample_.begin(), lower, upper);
lower_ = *lower;
upper_ = *upper;
sample_.clear();
less_than_lower_ = 0;
equal_to_lower_ = 0;
equal_to_upper_ = 0;
}
void SecondPass(const std::int64_t value) {
if (value < lower_) {
++less_than_lower_;
} else if (upper_ < value) {
} else if (value == lower_) {
++equal_to_lower_;
} else if (value == upper_) {
++equal_to_upper_;
} else {
sample_.push_back(value);
}
}
std::optional<std::int64_t> Select(std::int32_t k) {
if (k < less_than_lower_) {
return std::nullopt;
}
k -= less_than_lower_;
if (k < equal_to_lower_) {
return lower_;
}
k -= equal_to_lower_;
if (k < sample_.size()) {
const auto kth = sample_.begin() + k;
std::nth_element(sample_.begin(), kth, sample_.end());
return *kth;
}
k -= sample_.size();
if (k < equal_to_upper_) {
return upper_;
}
return std::nullopt;
}
private:
std::default_random_engine gen_;
std::vector<std::int64_t> sample_ = {};
std::int32_t mask_ = 0;
std::int64_t lower_ = std::numeric_limits<std::int64_t>::min();
std::int64_t upper_ = std::numeric_limits<std::int64_t>::max();
std::int32_t less_than_lower_ = 0;
std::int32_t equal_to_lower_ = 0;
std::int32_t equal_to_upper_ = 0;
};
} // namespace
int main() {
int t;
std::cin >> t;
for (int i = t; i > 0; --i) {
std::int32_t n;
std::int32_t k;
std::int64_t a;
std::int64_t b;
std::int64_t c;
std::int64_t d;
std::cin >> n >> k >> a >> b >> c >> d;
std::optional<std::int64_t> ans = std::nullopt;
LazySelector selector;
do {
{
selector.BeginFirstPass(n, k);
std::int64_t s = a;
for (std::int32_t j = n; j > 0; --j) {
selector.FirstPass(s);
s = (b * s + c) % d;
}
}
{
selector.BeginSecondPass(n, k);
std::int64_t s = a;
for (std::int32_t j = n; j > 0; --j) {
selector.SecondPass(s);
s = (b * s + c) % d;
}
}
ans = selector.Select(k - 1);
} while (!ans);
std::cout << *ans << '\n';
}
}
Related
I have a Time limit exceeded issue in problem 100 from UVa.
the question is here:
https://onlinejudge.org/index.php?option=com_onlinejudge&Itemid=8&category=24&page=show_problem&problem=36
Here is my code. Please help me find a solution. How can I avoid such problems?
I don't know if it is the problem with cin and cout or the while loops? this program works well in my terminal when I run it.
#include <iostream>
using namespace std;
int main()
{
int i , j, temp, n;
while (cin >> i >> j) //asking for user input
{
int x, y;
x = i;
y = j;
if (i > j) //sorting i and j to fix the order of numbers
{
temp = j;
j = i;
i = temp;
}
int answer = 0;
int counter;
while (i <= j)
{
n = i;
counter = 1; // make the value of counter to 1 because it increases if i is 1
while (1)
{
if(n == 1) { //if n = 1 then stop
break;
} else if (n % 2 == 0) //cheak if i is odd
{
n = (3 % n) + 1;
} else {
n = n / 2; //cheak if i is even
}
counter++; //increase by one for every number that is not 1
}
if (counter > answer)
{
answer = counter;
}
i++;
}
cout << x << " " << y << " " << answer << "\n";
}
return 0;
}
Thanks in advance
In my humble opinion this problem is not about calculating the resulting values using the given algorithm. Because of the simplicity this is just some noise. So,maybe we are talking about a XY Problem here.
Maybe I am wrong, but the main problem here seems to be memoization.
It maybe that values need to be calculated over and over again, because they are in some overlapped range. And this is not necessary.
So, we could memorize already calculated values, for example in a std::unordered_map (or std::map). So, something like in the below:
unsigned int getSteps(size_t index) noexcept {
unsigned counter{};
while (index != 1) {
if (index % 2) index = index * 3 + 1;
else index /= 2;
++counter;
}
return counter+1;
}
unsigned int getStepsMemo(size_t index) {
// Here we will memorize whatever we calculated before
static std::unordered_map<unsigned int, unsigned int> memo{};
// Resulting value
unsigned int result{};
// Look, if we did calculate the value in the past
auto iter = memo.find(index);
if (iter != memo.end())
// If yes, then reuse old value
result = iter->second;
else {
// If no, then calculate new and memorize it
result = getSteps(index);
memo[index] = result;
}
return result;
}
This will help with many given input pairs. It will avoid recalculating steps for already calculated values.
But having thought in this direction, we can also calculate all values at compile time and store them in a constexpr std::array. Then no calculation will be done during runtime. All steps for any number up to 10000 will be precalculated. So, the algorithm will never be called during runtime.
It should be clear that this is the fastest possible algorithm, because we do nothing. Just get the value from a lookup table.
And if we want to make things nice, then we pack everything in a class and let the class encapsulate the problem. Even input and output operatores will be overwritten and used for our own purposes.
And in the end, we will have an ultra fast one liner in our function main. Please see:
#include <iostream>
#include <utility>
#include <sstream>
#include <array>
#include <algorithm>
#include <iterator>
#include <unordered_map>
// All done during compile time -------------------------------------------------------------------
constexpr unsigned int getSteps(size_t index) noexcept {
unsigned counter{};
while (index != 1) {
if (index % 2) index = index * 3 + 1;
else index /= 2;
++counter;
}
return counter+1;
}
// Some helper to create a constexpr std::array initilized by a generator function
template <typename Generator, size_t ... Indices>
constexpr auto generateArrayHelper(Generator generator, std::index_sequence<Indices...>) {
return std::array<decltype(std::declval<Generator>()(size_t{})), sizeof...(Indices) > { generator(Indices+1)... };
}
template <size_t Size, typename Generator>
constexpr auto generateArray(Generator generator) {
return generateArrayHelper(generator, std::make_index_sequence<Size>());
}
constexpr size_t MaxIndex = 10000;
// This is the definition of a std::array<unsigned long long, 10000> with all step counts
constexpr auto steps = generateArray<MaxIndex>(getSteps);
// End of: All done during compile time -----------------------------------------------------------
// Some very simple helper class for easier handling of the functionality
struct StepsForPair {
// A pair with special functionality
unsigned int first{};
unsigned int second{};
// Simple extraction operator. Read 2 values
friend std::istream& operator >> (std::istream& is, StepsForPair& sfp) {
return is >> sfp.first >> sfp.second;
}
// Simple inserter. Sort first and second value and show result
friend std::ostream& operator << (std::ostream& os, const StepsForPair& sfp) {
unsigned int f{ sfp.first }, s{ sfp.second };
if (f > s) std::swap(f, s);
return os << sfp.first << ' ' << sfp.second << ' ' << *std::max_element(&steps[f], &steps[s]);
}
};
// Some test data. I will not use std::cin, but read from this std::istringstream here
std::istringstream testData{ R"(1 10
100 200
201 210
900 1000
22 22)" };
int main() {
// Read all input data and generate output
std::copy(std::istream_iterator<StepsForPair>(testData), {}, std::ostream_iterator<StepsForPair>(std::cout,"\n"));
}
Please note, since I do not have std::cin here on SO, I read the test values from a std::istringstream. Because of the overwritten extractor operator, this is easily possible.
If you want to read from std::cin then please replace in the std::copy statement in main "testData" eith "std::cin".
If you want to read from a file, then put a fileStream variable in there.
In this line n = (3 % n) + 1;, (3 % n) means that you take the remainder of 3 divided by n, which is probably not what you want. Change that to 3 * n
I'm trying to solve this problem:
Given an a×b rectangle, your task is to cut it into squares. On each move you can select a rectangle and cut it into two rectangles in such a way that all side lengths remain integers. What is the minimum possible number of moves?
My logic is that the minimum number of cuts means the minimum number of squares; I don't know if it's the correct approach.
I see which side is smaller, Now I know I need to cut bigSide/SmallSide of cuts to have squares of smallSide sides, then I am left with SmallSide and bigSide%smallSide. Then I go on till any side is 0 or both are equal.
#include <iostream>
int main() {
int a, b; std::cin >> a >> b; // sides of the rectangle
int res = 0;
while (a != 0 && b != 0) {
if (a > b) {
if (a % b == 0)
res += a / b - 1;
else
res += a / b;
a = a % b;
} else if (b > a) {
if (b % a == 0)
res += b / a - 1;
else
res += b / a;
b = b % a;
} else {
break;
}
}
std::cout << res;
return 0;
}
When the input is 404 288, my code gives 18, but the right answer is actually 10.
What am I doing wrong?
It seems clear to me that the problem defines each move as cutting a rectangle to two rectangles along the integer lines, and then asks for the minimum number of such cuts. As you can see there is a clear recursive nature in this problem. Once you cut a rectangle to two parts, you can recurse and cut each of them into squares with minimum moves and then sum up the answers. The problem is that the recursion might lead to exponential time complexity which leads us directly do dynamic programming. You have to use memoization to solve it efficiently (worst case time O(a*b*(a+b))) Here is what I'd suggest doing:
#include <iostream>
#include <vector>
using std::vector;
int min_cuts(int a, int b, vector<vector<int> > &mem) {
int min = mem[a][b];
// if already computed, just return the value
if (min > 0)
return min;
// if one side is divisible by the other,
// store min-cuts in 'min'
if (a%b==0)
min= a/b-1;
else if (b%a==0)
min= b/a -1;
// if there's no obvious solution, recurse
else {
// recurse on hight
for (int i=1; i<a/2; i++) {
int m = min_cuts(i,b, mem);
int n = min_cuts(a-i, b, mem);
if (min<0 or m+n+1<min)
min = m + n + 1;
}
// recurse on width
for (int j=1; j<b/2; j++) {
int m = min_cuts(a,j, mem);
int n = min_cuts(a, b-j, mem);
if (min<0 or m+n+1<min)
min = m + n + 1;
}
}
mem[a][b] = min;
return min;
}
int main() {
int a, b; std::cin >> a >> b; // sides of the rectangle
// -1 means the problem is not solved yet,
vector<vector<int> > mem(a+1, vector<int>(b+1, -1));
int res = min_cuts(a,b,mem);
std::cout << res << std::endl;
return 0;
}
The reason the foor loops go up until a/2 and b/2 is that cuting a paper is symmetric: if you cut along vertical line i it is the same as cutting along the line a-i if you flip the paper vertically. This is a little optimization hack that reduces complexity by a factor of 4 overall.
Another little hack is that by knowing that the problem is that if you transpose the paper the result is the same, meaining min_cuts(a,b)=min_cuts(b,a) you can potentially reduce computations by half. But any major further improvement, say a greedy algorithm would take more thinking (if there exists one at all).
The current answer is a good start, especially the suggestions to use memoization or dynamic programming, and potentially efficient enough.
Obviously, all answerers used the first with a sub-par data-structure. Vector-of-Vector has much space and performance overhead, using a (strict) lower triangular matrix stored in an array is much more efficient.
Using the maximum value as sentinel (easier with unsigned) would also reduce complexity.
Finally, let's move to dynamic programming instead of memoization to simplify and get even more efficient:
#include <algorithm>
#include <memory>
#include <utility>
constexpr unsigned min_cuts(unsigned a, unsigned b) {
if (a < b)
std::swap(a, b);
if (a == b || !b)
return 0;
const auto triangle = [](std::size_t n) { return n * (n - 1) / 2; };
const auto p = std::make_unique_for_overwrite<unsigned[]>(triangle(a));
/* const! */ unsigned zero = 0;
const auto f = [&](auto a, auto b) -> auto& {
if (a < b)
std::swap(a, b);
return a == b ? zero : p[triangle(a - 1) + b - 1];
};
for (auto i = 1u; i <= a; ++i) {
for (auto j = 1u; j < i; ++j) {
auto r = -1u;
for (auto k = i / 2; k; --k)
r = std::min(r, f(k, j) + f(i - k, j));
for (auto k = j / 2; k; --k)
r = std::min(r, f(k, i) + f(j - k, i));
f(i, j) = ++r;
}
}
return f(a, b);
}
I have a vector with digits of number, vector represents big integer in system with base 2^32. For example:
vector <unsigned> vec = {453860625, 469837947, 3503557200, 40}
This vector represent this big integer:
base = 2 ^ 32
3233755723588593872632005090577 = 40 * base ^ 3 + 3503557200 * base ^ 2 + 469837947 * base + 453860625
How to get this decimal representation in string?
Here is an inefficient way to do what you want, get a decimal string from a vector of word values representing an integer of arbitrary size.
I would have preferred to implement this as a class, for better encapsulation and so math operators could be added, but to better comply with the question, this is just a bunch of free functions for manipulating std::vector<unsigned> objects. This does use a typedef BiType as an alias for std::vector<unsigned> however.
Functions for doing the binary division make up most of this code. Much of it duplicates what can be done with std::bitset, but for bitsets of arbitrary size, as vectors of unsigned words. If you want to improve efficiency, plug in a division algorithm which does per-word operations, instead of per-bit. Also, the division code is general-purpose, when it is only ever used to divide by 10, so you could replace it with special-purpose division code.
The code generally assumes a vector of unsigned words and also that the base is the maximum unsigned value, plus one. I left a comment wherever things would go wrong for smaller bases or bases which are not a power of 2 (binary division requires base to be a power of 2).
Also, I only tested for 1 case, the one you gave in the OP -- and this is new, unverified code, so you might want to do some more testing. If you find a problem case, I'll be happy to fix the bug here.
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
namespace bigint {
using BiType = std::vector<unsigned>;
// cmp compares a with b, returning 1:a>b, 0:a==b, -1:a<b
int cmp(const BiType& a, const BiType& b) {
const auto max_size = std::max(a.size(), b.size());
for(auto i=max_size-1; i+1; --i) {
const auto wa = i < a.size() ? a[i] : 0;
const auto wb = i < b.size() ? b[i] : 0;
if(wa != wb) { return wa > wb ? 1 : -1; }
}
return 0;
}
bool is_zero(BiType& bi) {
for(auto w : bi) { if(w) return false; }
return true;
}
// canonize removes leading zero words
void canonize(BiType& bi) {
const auto size = bi.size();
if(!size || bi[size-1]) return;
for(auto i=size-2; i+1; --i) {
if(bi[i]) {
bi.resize(i + 1);
return;
}
}
bi.clear();
}
// subfrom subtracts b from a, modifying a
// a >= b must be guaranteed by caller
void subfrom(BiType& a, const BiType& b) {
unsigned borrow = 0;
for(std::size_t i=0; i<b.size(); ++i) {
if(b[i] || borrow) {
// TODO: handle error if i >= a.size()
const auto w = a[i] - b[i] - borrow;
// this relies on the automatic w = w (mod base),
// assuming unsigned max is base-1
// if this is not the case, w must be set to w % base here
borrow = w >= a[i];
a[i] = w;
}
}
for(auto i=b.size(); borrow; ++i) {
// TODO: handle error if i >= a.size()
borrow = !a[i];
--a[i];
// a[i] must be set modulo base here too
// (this is automatic when base is unsigned max + 1)
}
}
// binary division and its helpers: these require base to be a power of 2
// hi_bit_set is base/2
// the definition assumes CHAR_BIT == 8
const auto hi_bit_set = unsigned(1) << (sizeof(unsigned) * 8 - 1);
// shift_right_1 divides bi by 2, truncating any fraction
void shift_right_1(BiType& bi) {
unsigned carry = 0;
for(auto i=bi.size()-1; i+1; --i) {
const auto next_carry = (bi[i] & 1) ? hi_bit_set : 0;
bi[i] >>= 1;
bi[i] |= carry;
carry = next_carry;
}
// if carry is nonzero here, 1/2 was truncated from the result
canonize(bi);
}
// shift_left_1 multiplies bi by 2
void shift_left_1(BiType& bi) {
unsigned carry = 0;
for(std::size_t i=0; i<bi.size(); ++i) {
const unsigned next_carry = !!(bi[i] & hi_bit_set);
bi[i] <<= 1; // assumes high bit is lost, i.e. base is unsigned max + 1
bi[i] |= carry;
carry = next_carry;
}
if(carry) { bi.push_back(1); }
}
// sets an indexed bit in bi, growing the vector when required
void set_bit_at(BiType& bi, std::size_t index, bool set=true) {
std::size_t widx = index / (sizeof(unsigned) * 8);
std::size_t bidx = index % (sizeof(unsigned) * 8);
if(bi.size() < widx + 1) { bi.resize(widx + 1); }
if(set) { bi[widx] |= unsigned(1) << bidx; }
else { bi[widx] &= ~(unsigned(1) << bidx); }
}
// divide divides n by d, returning the result and leaving the remainder in n
// this is implemented using binary division
BiType divide(BiType& n, BiType d) {
if(is_zero(d)) {
// TODO: handle divide by zero
return {};
}
std::size_t shift = 0;
while(cmp(n, d) == 1) {
shift_left_1(d);
++shift;
}
BiType result;
do {
if(cmp(n, d) >= 0) {
set_bit_at(result, shift);
subfrom(n, d);
}
shift_right_1(d);
} while(shift--);
canonize(result);
canonize(n);
return result;
}
std::string get_decimal(BiType bi) {
std::string dec_string;
// repeat division by 10, using the remainder as a decimal digit
// this will build a string with digits in reverse order, so
// before returning, it will be reversed to correct this.
do {
const auto next_bi = divide(bi, {10});
const char digit_value = static_cast<char>(bi.size() ? bi[0] : 0);
dec_string.push_back('0' + digit_value);
bi = next_bi;
} while(!is_zero(bi));
std::reverse(dec_string.begin(), dec_string.end());
return dec_string;
}
}
int main() {
bigint::BiType my_big_int = {453860625, 469837947, 3503557200, 40};
auto dec_string = bigint::get_decimal(my_big_int);
std::cout << dec_string << '\n';
}
Output:
3233755723588593872632005090577
I want to find (n choose r) for large integers, and I also have to find out the mod of that number.
long long int choose(int a,int b)
{
if (b > a)
return (-1);
if(b==0 || a==1 || b==a)
return(1);
else
{
long long int r = ((choose(a-1,b))%10000007+(choose(a-1,b- 1))%10000007)%10000007;
return r;
}
}
I am using this piece of code, but I am getting TLE. If there is some other method to do that please tell me.
I don't have the reputation to comment yet, but I wanted to point out that the answer by rock321987 works pretty well:
It is fast and correct up to and including C(62, 31)
but cannot handle all inputs that have an output that fits in a uint64_t. As proof, try:
C(67, 33) = 14,226,520,737,620,288,370 (verify correctness and size)
Unfortunately, the other implementation spits out 8,829,174,638,479,413 which is incorrect. There are other ways to calculate nCr which won't break like this, however the real problem here is that there is no attempt to take advantage of the modulus.
Notice that p = 10000007 is prime, which allows us to leverage the fact that all integers have an inverse mod p, and that inverse is unique. Furthermore, we can find that inverse quite quickly. Another question has an answer on how to do that here, which I've replicated below.
This is handy since:
x/y mod p == x*(y inverse) mod p; and
xy mod p == (x mod p)(y mod p)
Modifying the other code a bit, and generalizing the problem we have the following:
#include <iostream>
#include <assert.h>
// p MUST be prime and less than 2^63
uint64_t inverseModp(uint64_t a, uint64_t p) {
assert(p < (1ull << 63));
assert(a < p);
assert(a != 0);
uint64_t ex = p-2, result = 1;
while (ex > 0) {
if (ex % 2 == 1) {
result = (result*a) % p;
}
a = (a*a) % p;
ex /= 2;
}
return result;
}
// p MUST be prime
uint32_t nCrModp(uint32_t n, uint32_t r, uint32_t p)
{
assert(r <= n);
if (r > n-r) r = n-r;
if (r == 0) return 1;
if(n/p - (n-r)/p > r/p) return 0;
uint64_t result = 1; //intermediary results may overflow 32 bits
for (uint32_t i = n, x = 1; i > r; --i, ++x) {
if( i % p != 0) {
result *= i % p;
result %= p;
}
if( x % p != 0) {
result *= inverseModp(x % p, p);
result %= p;
}
}
return result;
}
int main() {
uint32_t smallPrime = 17;
uint32_t medNum = 3001;
uint32_t halfMedNum = medNum >> 1;
std::cout << nCrModp(medNum, halfMedNum, smallPrime) << std::endl;
uint32_t bigPrime = 4294967291ul; // 2^32-5 is largest prime < 2^32
uint32_t bigNum = 1ul << 24;
uint32_t halfBigNum = bigNum >> 1;
std::cout << nCrModp(bigNum, halfBigNum, bigPrime) << std::endl;
}
Which should produce results for any set of 32-bit inputs if you are willing to wait. To prove a point, I've included the calculation for a 24-bit n, and the maximum 32-bit prime. My modest PC took ~13 seconds to calculate this. Check the answer against wolfram alpha, but beware that it may exceed the 'standard computation time' there.
There is still room for improvement if p is much smaller than (n-r) where r <= n-r. For example, we could precalculate all the inverses mod p instead of doing it on demand several times over.
nCr = n! / (r! * (n-r)!) {! = factorial}
now choose r or n - r in such a way that any of them is minimum
#include <cstdio>
#include <cmath>
#define MOD 10000007
int main()
{
int n, r, i, x = 1;
long long int res = 1;
scanf("%d%d", &n, &r);
int mini = fmin(r, (n - r));//minimum of r,n-r
for (i = n;i > mini;i--) {
res = (res * i) / x;
x++;
}
printf("%lld\n", res % MOD);
return 0;
}
it will work for most cases as required by programming competitions if the value of n and r are not too high
Time complexity :- O(min(r, n - r))
Limitation :- for languages like C/C++ etc. there will be overflow if
n > 60 (approximately)
as no datatype can store the final value..
The expansion of nCr can always be reduced to product of integers. This is done by canceling out terms in denominator. This approach is applied in the function given below.
This function has time complexity of O(n^2 * log(n)). This will calculate nCr % m for n<=10000 under 1 sec.
#include <numeric>
#include <algorithm>
int M=1e7+7;
int ncr(int n, int r)
{
r=min(r,n-r);
int A[r],i,j,B[r];
iota(A,A+r,n-r+1); //initializing A starting from n-r+1 to n
iota(B,B+r,1); //initializing B starting from 1 to r
int g;
for(i=0;i<r;i++)
for(j=0;j<r;j++)
{
if(B[i]==1)
break;
g=__gcd(B[i], A[j] );
A[j]/=g;
B[i]/=g;
}
long long ans=1;
for(i=0;i<r;i++)
ans=(ans*A[i])%M;
return ans;
}
How to check if the binary representation of an integer is a palindrome?
Hopefully correct:
_Bool is_palindrome(unsigned n)
{
unsigned m = 0;
for(unsigned tmp = n; tmp; tmp >>= 1)
m = (m << 1) | (tmp & 1);
return m == n;
}
Since you haven't specified a language in which to do it, here's some C code (not the most efficient implementation, but it should illustrate the point):
/* flip n */
unsigned int flip(unsigned int n)
{
int i, newInt = 0;
for (i=0; i<WORDSIZE; ++i)
{
newInt += (n & 0x0001);
newInt <<= 1;
n >>= 1;
}
return newInt;
}
bool isPalindrome(int n)
{
int flipped = flip(n);
/* shift to remove trailing zeroes */
while (!(flipped & 0x0001))
flipped >>= 1;
return n == flipped;
}
EDIT fixed for your 10001 thing.
Create a 256 lines chart containing a char and it's bit reversed char.
given a 4 byte integer,
take the first char, look it on the chart, compare the answer to the last char of the integer.
if they differ it is not palindrome, if the are the same repeat with the middle chars.
if they differ it is not palindrome else it is.
Plenty of nice solutions here. Let me add one that is not the most efficient, but very readable, in my opinion:
/* Reverses the digits of num assuming the given base. */
uint64_t
reverse_base(uint64_t num, uint8_t base)
{
uint64_t rev = num % base;
for (; num /= base; rev = rev * base + num % base);
return rev;
}
/* Tells whether num is palindrome in the given base. */
bool
is_palindrome_base(uint64_t num, uint8_t base)
{
/* A palindrome is equal to its reverse. */
return num == reverse_base(num, base);
}
/* Tells whether num is a binary palindrome. */
bool
is_palindrome_bin(uint64_t num)
{
/* A binary palindrome is a palindrome in base 2. */
return is_palindrome_base(num, 2);
}
The following should be adaptable to any unsigned type. (Bit operations on signed types tend to be fraught with problems.)
bool test_pal(unsigned n)
{
unsigned t = 0;
for(unsigned bit = 1; bit && bit <= n; bit <<= 1)
t = (t << 1) | !!(n & bit);
return t == n;
}
int palidrome (int num)
{
int rev = 0;
num = number;
while (num != 0)
{
rev = (rev << 1) | (num & 1); num >> 1;
}
if (rev = number) return 1; else return 0;
}
I always have a palindrome function that works with Strings, that returns true if it is, false otherwise, e.g. in Java. The only thing I need to do is something like:
int number = 245;
String test = Integer.toString(number, 2);
if(isPalindrome(test)){
...
}
A generic version:
#include <iostream>
#include <limits>
using namespace std;
template <class T>
bool ispalindrome(T x) {
size_t f = 0, l = (CHAR_BIT * sizeof x) - 1;
// strip leading zeros
while (!(x & (1 << l))) l--;
for (; f != l; ++f, --l) {
bool left = (x & (1 << f)) > 0;
bool right = (x & (1 << l)) > 0;
//cout << left << '\n';
//cout << right << '\n';
if (left != right) break;
}
return f != l;
}
int main() {
cout << ispalindrome(17) << "\n";
}
I think the best approach is to start at the ends and work your way inward, i.e. compare the first bit and the last bit, the second bit and the second to last bit, etc, which will have O(N/2) where N is the size of the int. If at any point your pairs aren't the same, it isn't a palindrome.
bool IsPalindrome(int n) {
bool palindrome = true;
size_t len = sizeof(n) * 8;
for (int i = 0; i < len / 2; i++) {
bool left_bit = !!(n & (1 << len - i - 1));
bool right_bit = !!(n & (1 << i));
if (left_bit != right_bit) {
palindrome = false;
break;
}
}
return palindrome;
}
Sometimes it's good to report a failure too;
There are lots of great answers here about the obvious way to do it, by analyzing in some form or other the bit pattern. I got to wondering, though, if there were any mathematical solutions? Are there properties of palendromic numbers that we might take advantage of?
So I played with the math a little bit, but the answer should really have been obvious from the start. It's trivial to prove that all binary palindromic numbers must be either odd or zero. That's about as far as I was able to get with it.
A little research showed no such approach for decimal palindromes, so it's either a very difficult problem or not solvable via a formal system. It might be interesting to prove the latter...
public static bool IsPalindrome(int n) {
for (int i = 0; i < 16; i++) {
if (((n >> i) & 1) != ((n >> (31 - i)) & 1)) {
return false;
}
}
return true;
}
bool PaLInt (unsigned int i, unsigned int bits)
{
unsigned int t = i;
unsigned int x = 0;
while(i)
{
x = x << bits;
x = x | (i & ((1<<bits) - 1));
i = i >> bits;
}
return x == t;
}
Call PalInt(i,1) for binary pallindromes
Call PalInt(i,3) for Octal Palindromes
Call PalInt(i,4) for Hex Palindromes
I know that this question has been posted 2 years ago, but I have a better solution which doesn't depend on the word size and all,
int temp = 0;
int i = num;
while (1)
{ // let's say num is the number which has to be checked
if (i & 0x1)
{
temp = temp + 1;
}
i = i >> 1;
if (i) {
temp = temp << 1;
}
else
{
break;
}
}
return temp == num;
In JAVA there is an easy way if you understand basic binary airthmetic, here is the code:
public static void main(String []args){
Integer num=73;
String bin=getBinary(num);
String revBin=reverse(bin);
Integer revNum=getInteger(revBin);
System.out.println("Is Palindrome: "+((num^revNum)==0));
}
static String getBinary(int c){
return Integer.toBinaryString(c);
}
static Integer getInteger(String c){
return Integer.parseInt(c,2);
}
static String reverse(String c){
return new StringBuilder(c).reverse().toString();
}
#include <iostream>
#include <math.h>
using namespace std;
int main()
{
unsigned int n = 134217729;
unsigned int bits = floor(log(n)/log(2)+1);
cout<< "Number of bits:" << bits << endl;
unsigned int i=0;
bool isPal = true;
while(i<(bits/2))
{
if(((n & (unsigned int)pow(2,bits-i-1)) && (n & (unsigned int)pow(2,i)))
||
(!(n & (unsigned int)pow(2,bits-i-1)) && !(n & (unsigned int)pow(2,i))))
{
i++;
continue;
}
else
{
cout<<"Not a palindrome" << endl;
isPal = false;
break;
}
}
if(isPal)
cout<<"Number is binary palindrome" << endl;
}
The solution below works in python:
def CheckBinPal(b):
b=str(bin(b))
if b[2:]==b[:1:-1]:
return True
else:
return False
where b is the integer
If you're using Clang, you can make use of some __builtins.
bool binaryPalindrome(const uint32_t n) {
return n == __builtin_bitreverse32(n << __builtin_clz(n));
}
One thing to note is that __builtin_clz(0) is undefined so you'll need to check for zero. If you're compiling on ARM using Clang (next generation mac), then this makes use of the assembly instructions for reverse and clz (compiler explorer).
clz w8, w0
lsl w8, w0, w8
rbit w8, w8
cmp w8, w0
cset w0, eq
ret
x86 has instructions for clz (sort of) but not reversing. Still, Clang will emit the fastest code possible for reversing on the target architecture.
Javascript Solution
function isPalindrome(num) {
const binaryNum = num.toString(2);
console.log(binaryNum)
for(let i=0, j=binaryNum.length-1; i<=j; i++, j--) {
if(binaryNum[i]!==binaryNum[j]) return false;
}
return true;
}
console.log(isPalindrome(0))