This is actually a SPOJ problem: WAYS
The task itself is very easy: all we need to do is compute the central binomial coefficients.
However, the problem setter imposes a notorious source limit of 120 bytes, so my question is: how can I get under that source code limit in the languages that are allowed?
Using the identity C(2n,n) = (2n)!/(n!)^2 = (2n(2n-1)/n^2) * C(2(n-1),n-1) = ((4n-2)/n)*C(2(n-1),n-1), here is a function that calculates the central binomial coefficient:
int f(int n)
{
    return n==1? 2 : f(n-1)*(4*n-2)/n;
}
Edit: Here is probably shortest code:
int f(int n){return n<2?2:f(n-1)*(4*n-2)/n;}
It is only 44 characters.
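For checking the golfed version against known values, here is a non-golfed sketch of the same recurrence. The name centralBinomial is only illustrative, and a 64-bit accumulator is used so the intermediate product has headroom. With a 32-bit int, the product f(n-1)*(4*n-2) still fits at n = 14 (10400600 * 54 = 561632400) but would overflow at n = 15.

```cpp
#include <cstdint>

// Central binomial coefficient C(2n, n), built up with
// C(2k, k) = C(2(k-1), k-1) * (4k - 2) / k.
// Illustrative helper, not a golfed submission.
std::int64_t centralBinomial(int n) {
    std::int64_t c = 1;  // C(0, 0)
    for (int k = 1; k <= n; ++k) {
        // The division is exact: c * (4k - 2) equals k * C(2k, k).
        c = c * (4 * k - 2) / k;
    }
    return c;
}
```

Calling centralBinomial(14) should give 40116600, which is C(28, 14).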
I haven't tried writing the code, but since the value of m is only 14, you could submit a table. Not sure if the code can be made shorter than this.
I am solving a LeetCode problem Search in Rotated Sorted Array, in order to learn Binary Search better. The problem statement is:
There is an integer array nums sorted in ascending order (with distinct values). Prior to being passed to your function, nums is possibly rotated at an unknown pivot index. For example, [0,1,2,4,5,6,7] might be rotated at pivot index 3 and become [4,5,6,7,0,1,2]. Given the array nums after the possible rotation and an integer target, return the index of target if it is in nums, or -1 if it is not in nums.
With some online help, I came up with the solution below, which I mostly understand:
class Solution {
public:
    int search(vector<int>& nums, int target) {
        int l=0, r=nums.size()-1;
        while(l<r) { // 1st loop; how is BS applicable here, since array is NOT sorted?
            int m=l+(r-l)/2;
            if(nums[m]>nums[r]) l=m+1;
            else r=m;
        }
        // cout<<"Lowest at: "<<r<<"\n";
        if(nums[r]==target) return r; //target==lowest number
        int start, end;
        if(target<=nums[nums.size()-1]) {
            start=r; end=nums.size()-1; // target can only be in the run to the right of the lowest element
        } else {
            start=0; end=r; // target can only be in the run to the left of the lowest element
        }
        l=start, r=end;
        while(l<r) {
            int m=l+(r-l)/2;
            if(nums[m]==target) return m;
            if(nums[m]>target) r=m;
            else l=m+1;
        }
        return nums[l]==target ? l : -1;
    }
};
My question: Are we searching over a parabola in the first while loop, trying to find the lowest point of a parabola, unlike a linear array in traditional binary search? Are we finding the minimum of a convex function? I understand how the values of l, m and r change leading to the right answer - but I do not fully follow how we can be guaranteed that if(nums[m]>nums[r]), our lowest value would be on the right.
You actually skipped something important by “getting help”.
Once, when I was struggling to integrate something tricky for Calculus Ⅰ, I went for help and the advisor said, “Oh, I know how to do this” and solved it. I learned nothing from him. It took me another week of going over it (and other problems) myself to understand it well enough that I could do it myself.
The purpose of these assignments is to solve the problem yourself. Even if your solution is faulty, you have learned more than simply reading and understanding the basics of one example problem someone else has solved.
In this particular case...
Since you already have a solution, let’s take a look at it: Notice that it contains two binary search loops. Why?
As you observed at the beginning, the offset shift makes the array discontinuous (not convex). However, the subarrays on either side of the discontinuity remain monotonic.
Take a moment to convince yourself that this is true.
Knowing this, what would be a good way to find and determine which of the two subarrays to search?
A binary search as ( n ⟶ ∞ ) is O(log n)
O(log n) ≡ O(2 log n)
I should also point out that the prompt gives as an example an arithmetic progression with a common difference of 1, but the prompt itself imposes no such restriction. All it says is that you start with a strictly increasing sequence (no duplicate values). You could have as input [19 74 512 513 3 7 12].
Does the supplied solution handle this possibility?
Why or why not?
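One way to test your own answer to these questions is to isolate the first loop and run it on that input. The sketch below is only an experiment aid: the name findPivot is made up here, and the comments state the invariant I believe the loop relies on (distinct values, non-empty array).

```cpp
#include <cstddef>
#include <vector>

// Index of the smallest element of a rotated, strictly increasing array.
// Precondition (assumed): nums is non-empty and has no duplicates.
std::size_t findPivot(const std::vector<int>& nums) {
    std::size_t l = 0, r = nums.size() - 1;
    while (l < r) {
        const std::size_t m = l + (r - l) / 2;
        if (nums[m] > nums[r]) {
            // A drop must occur somewhere after m, so the minimum is right of m.
            l = m + 1;
        } else {
            // nums[m] < nums[r] means no drop between m and r,
            // so the minimum is at m or to its left.
            r = m;
        }
    }
    return l;
}
```

On [19 74 512 513 3 7 12] this lands on index 4, the position of 3; note that only comparisons between elements are used, never their differences.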
I am trying to improve the speed of a computational (biological) model written in C++ (previous version is on my github: Prokaryotes). The most time-consuming function is where I calculate binding affinities between transcription factors and binding sites on a single genome.
Background: In my model, binding affinity is given by the Hamming distance between the binding domain of a transcription factor (a 20-bool array) and the sequence of a binding site (also a 20-bool array). For a single genome, I need to calculate the affinities between all active transcription factors (typically 5-10) and all binding sites (typically 10-50). I do this every timestep for more than 10,000 cells in the population to update their gene expression states. Having to calculate up to half a million comparisons of 20-bool arrays to simulate just one timestep of my model means that typical experiments take several months (2M--10M timesteps).
For the previous version of the model (link above) genomes remained fairly small, so I could calculate binding affinities once for every cell (at birth) and store and re-use these numbers during the cell's lifetime. However, in the latest version, genomes expand considerably and multiple genomes reside within the same cell. Thus, storing the affinities of all transcription factor-binding site pairs in a cell becomes impractical.
In the current implementation I defined an inline function belonging to the Bead class (which is a base class for transcription factor class "Regulator" and binding site class "Bsite"). It is written directly in the header file Bead.hh:
inline int Bead::BindingAffinity(const bool* sequenceA, const bool* sequenceB, int seqlen) const
{
    int affinity = 0;
    for (int i=0; i<seqlen; i++)
        affinity += (int)(sequenceA[i]^sequenceB[i]);
    return affinity;
}
The above function accepts two pointers to boolean arrays (sequenceA and sequenceB), and an integer specifying their length (seqlen). Using a simple for-loop I then check at how many positions the arrays differ (sequenceA[i]^sequenceB[i]), summing into the variable affinity.
Given a binding site (bsite) on the genome, we can then iterate through the genome and for every transcription factor (reg) calculate its affinity to this particular binding site like so:
affinity = (double)reg->BindingAffinity(bsite->sequence, reg->sequence);
So, this is how streamlined I managed to make it; since I don't have a programming background, I wonder whether there are better ways to write the above function or to structure the code (i.e. should BindingAffinity be a function of the base Bead class)? Suggestions are greatly appreciated.
Thanks to @PaulMcKenzie and @eike for your suggestions. I tested both ideas against my previous implementation. Below are the results. In short, both answers work very well.
My previous implementation yielded an average runtime of 5m40 +/- 7 (n=3) for 1000 timesteps of the model. Profiling analysis with GPROF showed that the function BindingAffinity() took 24.3% of total runtime. [see Question for the code].
The bitset implementation yielded an average runtime of 5m11 +/- 6 (n=3), corresponding to a ~9% speed increase. Only 3.5% of total runtime is spent in BindingAffinity().
//Function definition in Bead.hh
inline int Bead::BindingAffinity(std::bitset<regulator_length> sequenceA, const std::bitset<regulator_length>& sequenceB) const
{
    return (int)(sequenceA ^= sequenceB).count();
}
//Function call in
affinity = (double)reg->BindingAffinity(bsite->sequence, reg->sequence);
The main downside of the bitset implementation is that unlike with boolean arrays (my previous implementation), I have to specify the length of the bitset that goes into the function. I am occasionally comparing bitsets of different lengths, so for these I now have to specify separate functions (templates would not work for multi-file project according to
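On the template concern: a function template that is defined entirely in a header can normally be included from several source files, so a single templated helper may be able to cover bitsets of different lengths. A minimal sketch, assuming a free function called hammingDistance (a hypothetical name, not part of the Bead class) placed in a shared header:

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical header-only helper: number of positions at which two
// equal-length bitsets differ. N is deduced from the arguments.
template <std::size_t N>
int hammingDistance(const std::bitset<N>& a, const std::bitset<N>& b) {
    return static_cast<int>((a ^ b).count());
}
```

Both arguments must have the same length N; comparing sequences of different lengths would still need an explicit rule for the extra bits.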
For the integer implementation I tried two alternatives to the std::popcount(seq1^seq2) function suggested by @eike, since I am working with an older version of C++ that doesn't include it (std::popcount was added in C++20).
Alternative #1:
inline int Bead::BindingAffinity(int sequenceA, int sequenceB) const
{
    int i = sequenceA^sequenceB;
    std::bitset<32> bi (i);
    return (int)bi.count();
}
Alternative #2:
inline int Bead::BindingAffinity(int sequenceA, int sequenceB) const
{
    unsigned int i = sequenceA^sequenceB; // unsigned: avoids sign-extending shifts and signed overflow below
    //SWAR algorithm, copied from
    i = i - ((i >> 1) & 0x55555555); // add pairs of bits
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333); // quads
    i = (i + (i >> 4)) & 0x0F0F0F0F; // groups of 8
    return (i * 0x01010101) >> 24; // horizontal sum of bytes
}
These yielded average runtimes of 5m06 +/- 6 (n=3) and 5m06 +/- 3 (n=3), respectively, corresponding to a ~10% speed increase compared to my previous implementation. I only profiled Alternative #2, which showed that only 2.2% of total runtime was spent in BindingAffinity(). The downside of using integers for bitstrings is that I have to be very careful whenever I change any of the code. Single-bit mutations are definitely possible, as mentioned by @eike, but everything is just a little bit trickier.
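As an illustration of what a single-bit mutation can look like on an integer-encoded sequence, here is a small sketch. The helper name flipBit and the choice of a 32-bit unsigned type are assumptions for this example, not code from the model:

```cpp
#include <cstdint>

// Hypothetical helper: return `sequence` with bit `pos` flipped.
// pos is 0-based and must be smaller than 32.
inline std::uint32_t flipBit(std::uint32_t sequence, unsigned pos) {
    return sequence ^ (std::uint32_t{1} << pos);
}
```

Applying flipBit twice with the same position restores the original sequence, which makes it easy to sanity-check.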
Both the bitset and integer implementations for comparing bitstrings achieve impressive speed improvements. So much so, that BindingAffinity() is no longer the bottleneck in my code.
I am trying to find the element of an array that has the minimum absolute value. For example, in the array [5.1, -2.2, 8.2, -1, 4, 3, -5, 6], I want to get the value -1. I use the following code (myarray is a 1D array and is not sorted):
for (int i = 1; i < 8; ++i)
if(fabsf(myarray[i])<fabsf(myarray[0])) myarray[0] = myarray[i];
Then, the target value is in myarray[0].
Because I have to repeat this procedure many times, this piece of code becomes the bottleneck in my program. Does anyone know how to improve this code? Thanks in advance!
BTW, the size of the array is always eight. Could this be used to optimize this code?
Update: so far, following code works slightly better on my machine:
float absMin = fabsf(myarray[0]); int index = 0;
for (int i = 1; i < 8; ++i)
if(fabsf(myarray[i])<absMin) {absMin = fabsf(myarray[i]); index=i;}
float result = myarray[index];
I am wondering how to avoid fabsf, because I just want to compare the absolute values instead of computing them. Does anyone have any idea?
There are some urban myths like inlining, loop unrolling by hand and similar which are supposed to make your code faster. Good news is you don't have to do it, at least if you use -O3 compiler optimization.
Bad news is, if you already use -O3 there is nothing you can do to speed up this function: the compiler will optimize the hell out of your code! For example it will surely do the caching of fabsf(myarray[0]) as some suggested. The only thing you can achieve with this "refactoring" is to build bugs into your program and make it less readable.
My advice is to look somewhere else for improvements:
- Try to reduce the number of invocations of this code.
- If this code is the bottleneck, then my guess would be that you recalculate the minimal value over and over again (otherwise filling the values into the array would take approximately the same time), so cache the results of the search.
- Shift costs to changing the elements of the array, for example by using some fancy data structures (heaps, priority_queue) or by tracking the minimum of the elements. Let's say your array has only two elements [1,2], so the minimum is 1. Now if you change
  - 2 to 3, you don't have to do anything;
  - 2 to 0, you can easily update your minimum to 0;
  - 1 to 3, you have to loop through all elements. But maybe this case does not occur that often.
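The minimum-tracking idea from the last point could be sketched roughly as follows. The class name TrackedArray8 and its methods are hypothetical, and the sketch assumes that every write to the array goes through set(); whether it pays off depends on how often the current minimum gets overwritten with a larger value.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical wrapper that remembers where the element with the
// smallest absolute value lives, and rescans only when it has to.
class TrackedArray8 {
public:
    explicit TrackedArray8(const std::array<float, 8>& values) : data_(values) {
        rescan();
    }

    void set(std::size_t i, float value) {
        const float oldMinAbs = std::fabs(data_[minIndex_]);
        data_[i] = value;
        if (std::fabs(value) <= oldMinAbs) {
            minIndex_ = i;   // new value is at least as small: cheap update
        } else if (i == minIndex_) {
            rescan();        // the old minimum grew: full scan needed
        }                    // otherwise the minimum is unaffected
    }

    float minAbsElement() const { return data_[minIndex_]; }

private:
    void rescan() {
        minIndex_ = 0;
        for (std::size_t i = 1; i < data_.size(); ++i) {
            if (std::fabs(data_[i]) < std::fabs(data_[minIndex_])) minIndex_ = i;
        }
    }

    std::array<float, 8> data_;
    std::size_t minIndex_ = 0;
};
```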
Can you store the values with fabsf already applied?
Also, as @Gerstrong mentions, storing the number outside the loop and only recalculating it when the array changes will give you a boost.
Calling partial_sort or nth_element will reorder the array only as far as needed to put the correct value in the first position.
std::nth_element(v.begin(), v.begin(), v.end(), [](float lhs, float rhs){
    return fabsf(lhs)<fabsf(rhs);
});
Let me give some ideas that could help:
float minVal = fabsf(myarray[0]);
for (int i = 1; i < 8; ++i)
if(fabsf(myarray[i])<minVal) minVal = fabsf(myarray[i]);
myarray[0] = minVal;
But compilers nowadays are very smart and you might not get any more speed, as you already get optimized code. It depends on how your mentioned piece of code is called.
Another way to optimize this may be to use the C++ STL, for example the binary search tree container std::set:
// Absolute-value comparator for std::set (must be a type, so a functor is used)
struct absless_compare {
    bool operator()(float a, float b) const {
        return fabsf(a) < fabsf(b);
    }
};
std::set<float, absless_compare> mySet = {5.1, -2.2, 8.2, -1, 4, 3, -5, 6};
const float minVal = *(mySet.begin());
With this approach the numbers are kept sorted in ascending order of absolute value as you insert them. std::set uses std::less as its default comparator, but you can supply a different one, as in this example. This might help on larger datasets, but you mentioned you only have eight values to compare, so it really will not help.
Eight elements is a very small number, which can be kept on the stack, for example by declaring std::array<float,8> myarray close to your search function before filling it with data. You should try these variants on your full codebase and observe what helps. Of course, whether you declare std::array<float,8> myarray or float myarray[8], you should get the same runtime results.
What you could also check is whether fabsf really takes a float parameter and does not convert your variable to double, which would degrade performance. There is also std::abs(), which as far as I understand picks the right data type, because in C++ it is overloaded for the floating-point types.
If you don't want to use fabs, you could obviously use a function like this
float myAbs(const float val)
{
    return (val<0) ? -val : val;
}
or you can clear the sign bit, which is the bit that marks the number as negative. Either way, I'm pretty sure that fabsf is fully aware of that, and I don't think code like that will make it faster.
So I would check if the argument is converted to double. If you have C99 Standard in your system though, you should not have that issue.
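For completeness, the sign-bit idea mentioned above could look like the sketch below. It assumes float is a 32-bit IEEE-754 value, which is common but not guaranteed by the standard, and the name absBySignBit is made up for this example. As said, I would not expect it to beat fabsf.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: absolute value by clearing the sign bit.
// Assumes a 32-bit IEEE-754 float (checked only for size below).
inline float absBySignBit(float v) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &v, sizeof bits);  // well-defined way to read the bit pattern
    bits &= 0x7FFFFFFFu;                  // clear the most significant (sign) bit
    std::memcpy(&v, &bits, sizeof v);
    return v;
}
```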
One thought would be to do your comparisons "tournament" style, instead of linearly. In other words, you first compare 1 with 2, 3 with 4, etc. Then you take those 4 elements and do the same thing, and then again, until you only have one element left.
This does not change the number of comparisons. Since each comparison eliminates one element from the running, you will have exactly 7 comparisons no matter what. So why do I suggest this? Because it removes data dependencies from your code. Modern processors have multiple pipelines and can retire multiple instructions simultaneously. However, when you do the comparisons in a loop, each loop iteration depends on the previous one. When you do it tournament style, the first four comparisons are completely independent, so the processor may be able to do them all at once.
In addition to doing that, you can compute all the fabs at once in a trivial loop and put it in a new array. Since the fabs computations are independent, this can get sped up pretty easily. You would do this first, and then the tournament style comparisons to get the index. It should be exactly the same number of operations, it's just changing the order around so that the compiler can more easily see larger blocks that lack data dependencies.
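A rough sketch of what I mean, for exactly eight elements. The function name argminAbs8 is just for illustration, and whether this actually beats the simple loop depends on the compiler and the processor, so it needs to be measured.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Index of the element with the smallest absolute value, found
// tournament style: 4 + 2 + 1 = 7 comparisons in three rounds.
std::size_t argminAbs8(const std::array<float, 8>& a) {
    // Step 1: all absolute values up front; the iterations are independent.
    float mag[8];
    for (std::size_t i = 0; i < 8; ++i) mag[i] = std::fabs(a[i]);

    // Round 1: four independent comparisons of neighbouring pairs.
    std::size_t w[4];
    for (std::size_t i = 0; i < 4; ++i)
        w[i] = (mag[2 * i + 1] < mag[2 * i]) ? 2 * i + 1 : 2 * i;

    // Round 2: two independent comparisons of the round-1 winners.
    const std::size_t s0 = (mag[w[1]] < mag[w[0]]) ? w[1] : w[0];
    const std::size_t s1 = (mag[w[3]] < mag[w[2]]) ? w[3] : w[2];

    // Final: one comparison. Ties keep the lower index throughout.
    return (mag[s1] < mag[s0]) ? s1 : s0;
}
```

For the array from the question this gives index 3, so a[3] is -1.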
The element of an array with minimal absolute value
Let the array A be (the code below assumes A is an Eigen vector)
A = [5.1, -2.2, 8.2, -1, 4, 3, -5, 6]
The minimal absolute value of A is
double miniAbsValue = A.array().abs().minCoeff();
int i_minimum = 0; // to find the position of minimum absolute value
for(int i = 0; i < 8; i++)
{
    double ftn = A(i);
    if( fabs(ftn) == miniAbsValue )
        i_minimum = i;
}
Now the element of A with minimal absolute value is A(i_minimum).
I'm writing a function for calculating integrals recursively, using the trapezoid rule. For some f(x) on the interval (a,b), the method is to calculate the area of the big trapezoid with side (b-a) and then compare it with the sum of small trapezoids formed after dividing the interval into n parts. If the difference is larger than some given error, the function is called again for each small trapezoid and the results summed. If the difference is smaller, it returns the arithmetic mean of the two values.
The function takes two parameters, a function pointer to the function which is to be integrated and a constant reference to an auxiliary structure, which contains information such as the interval (a,b), the amount of partitions, etc:
struct Config{
    double min,max;
    int partitions;
    double precision;
};
The problem arises when I want to change the number of partitions with each iteration; for the moment, let's say just increment it by one. I see no way of doing this without knowing the current depth of the recursion:
double integrate(const Config &conf, funptr f){
    double a=conf.min,b=conf.max;
    int n=conf.partitions;
    //calculating the trapezoid areas here
    if(std::abs(bigTrapezoid-sumOfSmallTrapezoids) > conf.precision){
        double s=0.;
        Config *configs = new Config[n];
        int newpartitions = n+(calls);
        for(int i=0; i < n;++i){
            configs[i]={ a+i*(b-a)/n , a+(i+1)*(b-a)/n , newpartitions};
            s+=integrate(configs[i],f);
        }
        delete [] configs;
        return s; }
    return 0.5*(bigTrapezoid+sumOfSmallTrapezoids);}
The part I'm missing here is of course a way to find (calls). I have tried doing something similar to this answer, but it does not work, in fact it freezes the pc until makefile kills the process. But perhaps I'm doing it wrong. I do not want to add an extra parameter to the function or an additional variable to the structure. How should I proceed?
You cannot "find" calls, but you can definitely pass it yourself, like this:
double integrate(const Config &conf, funptr f, int calls=0) {
    ...
    s+=integrate(configs[i],f, calls+1);
}
It seems to me that 'int newpartitions = n + 1;' would be enough, no? At every recursion level, the number of partitions increases by one. Say conf.partitions starts off at 1. If the routine needs to recurse down a new level, newpartitions is 2, and you will build 2 new Config instances each with '2' as the value for partitions. Recursing down another level, newpartitions is 3, and you build 3 Configs, each with '3' as 'partitions', and so on.
The trick here is to make sure your code is robust enough to avoid infinite recursion.
By the way, it seems inefficient to me to use dynamic allocation for Config instances that have to be destroyed after the loop. Why not build a single Config instance on the stack inside the loop? Your code should run much faster that way.
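Putting these suggestions together, a minimal sketch could look like this. It is not the asker's full routine: funptr is assumed to be double(*)(double), the depth parameter and the maxDepth cap are additions of mine to guard against runaway recursion, and square is only an example integrand. Note that precision is applied to each subinterval separately and that partitions should start at 2 or more, since with a single partition the two estimates are identical.

```cpp
#include <cmath>

struct Config {
    double min, max;
    int partitions;
    double precision;
};

using funptr = double (*)(double);  // assumed signature of the integrand

// Example integrand for trying the sketch out.
double square(double x) { return x * x; }

double integrate(const Config& conf, funptr f, int depth = 0) {
    const double a = conf.min, b = conf.max;
    const int n = conf.partitions;
    const double h = (b - a) / n;

    // One big trapezoid versus the sum of n small ones.
    const double bigTrapezoid = 0.5 * (b - a) * (f(a) + f(b));
    double sumOfSmallTrapezoids = 0.0;
    for (int i = 0; i < n; ++i) {
        sumOfSmallTrapezoids += 0.5 * h * (f(a + i * h) + f(a + (i + 1) * h));
    }

    const int maxDepth = 8;  // hypothetical safety cap, not from the question
    if (std::abs(bigTrapezoid - sumOfSmallTrapezoids) > conf.precision && depth < maxDepth) {
        double s = 0.0;
        for (int i = 0; i < n; ++i) {
            // One Config on the stack per subinterval, with one more partition.
            const Config sub{a + i * h, a + (i + 1) * h, n + 1, conf.precision};
            s += integrate(sub, f, depth + 1);
        }
        return s;
    }
    return 0.5 * (bigTrapezoid + sumOfSmallTrapezoids);
}
```

For example, integrate(Config{0.0, 1.0, 2, 1e-6}, square) should come out close to 1/3.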
This is the only question on my final review that I'm still uncertain about. I've figured all of the other 74 out, but this one is completely stumping me. I think it has something to do with finding C and k, but I don't remember how to do this or what it even means... and I may not even be on the right track there.
The question I'm encountering is "What is the minimum acceptable value for N such that the definition for O(f(N)) is satisfied for member function Heap::Insert(int v)?"
The code for Heap::Insert(int v) is as follows:
void Insert(int v)
{
    if (IsFull()) return;
    int p=++count;
    while (H[p/2] > v) {
        H[p] = H[p/2];
        p/= 2;
    }
    H[p] = v;
}
The possible answers given are: 32, 64, 128, 256.
I'm completely stumped and have to take this exam in the morning. Help would be immensely appreciated.
I admit the question is quite obscure, but I will try to give a reasonable explanation.
If we call f(N) the time complexity of the operation executed by your code as a function of the number of elements in the heap, the professor wanted you to remember that f(N) = O(log(N)) for a binary heap insert, that is O(h), where h is the height of the heap and we assume it to be complete (remember how a heap works and that it can be represented as a binary tree). Thus, you have to try those four values of Nmin and find the smallest one that satisfies the definition, i.e. the one for which
f(N) <= k*log(N)
for every N >= Nmin and at least one constant k. I would give you the details for calculating f(N) if only your code did what the professor or you expected it to do.
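For reference, the general definition being applied here can be written out as follows, with g(N) = log(N) in this case and k and N_min playing the roles of the constants mentioned above:

```latex
f(N) = O\bigl(g(N)\bigr)
\iff
\exists\, k > 0,\ \exists\, N_{\min} \ \text{such that}\
f(N) \le k \, g(N) \quad \text{for all } N \ge N_{\min}
```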
