Let's say I have a total number
tN = 12
and a set of elements
elem = [1,2,3,4]
and a probability for each element to be taken
prob = [0.0, 0.5, 0.75, 0.25]
I need to get a random multiset of these elements, such that
the taken elements reflect the probabilities
the sum of the taken elements is tN
With the example above, here are some possible outcomes:
3 3 2 4
2 3 2 3 2
3 4 2 3
2 2 3 3 2
3 2 3 2 2
At the moment, the maximum tN will be 64, and the elements are the ones above (1, 2, 3, 4).
Is this a knapsack problem? How would you solve it easily? Both an "on the fly" and a "pre-calculated" approach are acceptable (depending on the computation time). I'm doing this for a C++ app.
Goal: I don't need exactly these percentages in the final sequence. I just want an element with a higher probability to have a better chance of appearing in the final sequence. In short: in the example, I prefer sequences with more 3s and 2s than 4s, and no 1s.
Here's an attempt to select elements according to their probabilities, over 10 draws:
Randomizer randomizer;
int tN = 12;
std::vector<int> elem = {2, 3, 4};
std::vector<float> prob = {0.5f, 0.75f, 0.25f};
float probSum = std::accumulate(begin(prob), end(prob), 0.0f, std::plus<float>());
// cumulative, normalized probabilities
std::vector<float> probScaled;
for (size_t i = 0; i < prob.size(); i++)
{
    probScaled.push_back((i == 0 ? 0.0f : probScaled[i - 1]) + (prob[i] / probSum));
}
for (size_t r = 0; r < 10; r++)
{
    float rnd = randomizer.getRandomValue();
    // default to the last element, in case rounding leaves rnd >= probScaled.back()
    size_t index = probScaled.size() - 1;
    for (size_t i = 0; i < probScaled.size(); i++)
    {
        if (rnd < probScaled[i])
        {
            index = i;
            break;
        }
    }
    std::cout << elem[index] << std::endl;
}
which gives, for example, this choice:
3
3
2
2
4
2
2
4
3
3
Now I just need to build a multiset which sums up to tN. Any tips?
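One possible way to go from single draws to a multiset, as a rough sketch only: keep drawing weighted elements, but give weight 0 to any element larger than the amount still missing, and start over if no element fits. The function name randomMultiset and the maxRestarts parameter are made up for illustration, std::discrete_distribution stands in for the Randomizer class above, and the elements are assumed to be positive. The resulting frequencies only roughly follow prob, which seems to match the stated goal.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Draws weighted elements until their sum is exactly `target`.
// Returns an empty vector if no valid multiset was found within maxRestarts
// restarts (hypothetical safety limit, e.g. for unreachable targets).
// Assumes every value in `elem` is positive.
std::vector<int> randomMultiset(int target,
                                const std::vector<int>& elem,
                                const std::vector<double>& prob,
                                std::mt19937& rng,
                                int maxRestarts = 1000) {
    assert(elem.size() == prob.size());
    for (int attempt = 0; attempt <= maxRestarts; ++attempt) {
        std::vector<int> result;
        int remaining = target;
        bool stuck = false;
        while (remaining > 0 && !stuck) {
            // Elements that no longer fit get weight 0 for this draw.
            std::vector<double> w(prob.size());
            double total = 0.0;
            for (std::size_t i = 0; i < elem.size(); ++i) {
                w[i] = (elem[i] <= remaining) ? prob[i] : 0.0;
                total += w[i];
            }
            if (total <= 0.0) {  // dead end, e.g. 1 is missing but prob of 1 is 0
                stuck = true;
                break;
            }
            std::discrete_distribution<int> pick(w.begin(), w.end());
            const int idx = pick(rng);
            if (w[idx] <= 0.0) continue;  // defensive: never accept a zero-weight element
            result.push_back(elem[idx]);
            remaining -= elem[idx];
        }
        if (!stuck) return result;
    }
    return {};
}
```

With the values from the question, a call would look like randomMultiset(12, {1, 2, 3, 4}, {0.0, 0.5, 0.75, 0.25}, rng) with rng being a seeded std::mt19937. For tN up to 64 the restarts should be cheap, but that is an expectation, not something measured here.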
Question - Given an array of integers A of length N, find the length of the longest subsequence which is first increasing then decreasing.
Input:[1, 11, 2, 10, 4, 5, 2, 1]
Output: 6
Explanation:[1 2 10 4 2 1] is the longest subsequence.
I wrote a top-down approach. Its arguments are the vector A (containing the sequence), the start index (denoting the current index), the end index, the previous value, large (denoting the maximum value in the current subsequence) and an STL map (m).
For the backtrack approach I have two cases -
element is excluded - In this case we move to next element(start+1). prev and large remains same.
element is included - there are two cases
a. If the current value (A[start]) is greater than prev and prev == large, the sequence is still increasing. The recurrence becomes 1 + LS(start+1, A[start], A[start]), i.e. both prev and the largest element become A[start].
b. If the current value (A[start]) is less than prev and less than large, the sequence is decreasing. The recurrence becomes 1 + LS(start+1, A[start], large), i.e. prev becomes A[start] and the largest element stays large.
Base Cases -
if the current index is past the end of the array, i.e. start == end, then return 0.
if the sequence decreases and then increases again, then return 0,
i.e. if (current > previous and previous < maximum value) then return 0.
This is not an optimized approach, as map.find() is itself a costly operation. Can someone suggest an optimized top-down approach with memoization?
int LS(const vector<int> &A, int start, int end, int prev, int large, map<string, int>&m){
    if(start == end){return 0;}
    if(A[start] > prev && prev < large){
        return 0;
    }
    string key = to_string(start) + '|' + to_string(prev) + '|' + to_string(large);
    if(m.find(key) == m.end()){
        int excl = LS(A, start+1, end, prev, large, m);
        int incl = 0;
        if(((A[start] > prev)&&(prev==large))){
            incl = 1 + LS(A, start+1, end, A[start],A[start], m);
        }else if(((A[start]<prev)&&(A[start]<large))){
            incl = 1+ LS(A, start+1, end, A[start], large, m);
        }
        m[key] = max(incl, excl);
    }
    return m[key];
}
int Solution::longestSubsequenceLength(const vector<int> &A) {
    map<string, int>m;
    return LS(A, 0, A.size(), INT_MIN, INT_MIN, m);
}
Not sure about top-down but it seems we could use the classic LIS algorithm to just approach each element from "both sides" as it were. Here's the example with each element as the rightmost and leftmost, respectively, as we iterate from both directions. We can see three instances of a valid sequence of length 6:
[1, 11, 2, 10, 4, 5, 2, 1]
1 11 11 10 4 2 1
1 2 2 1
1 2 10 10 4 2 1
1 2 4 4 2 1
1 2 4 5 5 2 1
1 2 2 1
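A minimal sketch of that "both sides" idea, assuming strict increase and decrease and assuming that a purely increasing or purely decreasing subsequence also counts: run the classic O(n^2) LIS once from the left (inc) and once from the right (dec), then combine both at every possible peak. The name longestBitonic is made up for this example, and this is a bottom-up version, not the memoized top-down one the question asks for.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Length of the longest subsequence that first strictly increases and then
// strictly decreases, treating every element as a candidate peak.
int longestBitonic(const std::vector<int>& A) {
    const int n = static_cast<int>(A.size());
    if (n == 0) return 0;
    std::vector<int> inc(n, 1);  // inc[i]: longest increasing subsequence ending at i
    std::vector<int> dec(n, 1);  // dec[i]: longest decreasing subsequence starting at i
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < i; ++j)
            if (A[j] < A[i]) inc[i] = std::max(inc[i], inc[j] + 1);
    for (int i = n - 1; i >= 0; --i)
        for (int j = n - 1; j > i; --j)
            if (A[j] < A[i]) dec[i] = std::max(dec[i], dec[j] + 1);
    int best = 0;
    for (int i = 0; i < n; ++i)
        best = std::max(best, inc[i] + dec[i] - 1);  // the peak is counted twice
    return best;
}
```

For the input from the question, [1, 11, 2, 10, 4, 5, 2, 1], this gives 6. It needs O(n^2) time and O(n) extra space, with no map lookups or string keys.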
I am trying to find the Time Complexity of this algorithm.
The iterative algorithm produces all the bit-strings within a given Hamming distance from the input bit-string. It generates all increasing sequences 0 <= a[0] < ... < a[dist-1] < strlen(num), and flips the bits at the corresponding indices.
The vector a is supposed to keep indices for which bits have to be inverted. So if a contains the current index i, we print 1 instead of 0 and vice versa. Otherwise we print the bit as is (see else-part), as shown below:
// e.g. hamming("0000", 2);
void hamming(const char* num, size_t dist) {
    assert(dist > 0);
    vector<int> a(dist);
    size_t k = 0, n = strlen(num);
    a[k] = -1;
    while (true)
        if (++a[k] >= n)
            if (k == 0)
                return;
            else {
                --k;
                continue;
            }
        else
            if (k == dist - 1) {
                // this is an O(n) operation and will be called
                // (n choose dist) times, in total.
                print(num, a);
            }
            else {
                a[k+1] = a[k];
                ++k;
            }
}
What is the Time Complexity of this algorithm?
My attempt says:
dist * n + (n choose dist) * n + 2
but this does not seem to be right. Consider the following examples, all with dist = 2:
len = 3, (3 choose 2) = 3 * O(n), 10 while iterations
len = 4, (4 choose 2) = 6 * O(n), 15 while iterations
len = 5, (5 choose 2) = 10 * O(n), 21 while iterations
len = 6, (6 choose 2) = 15 * O(n), 28 while iterations
Here are two representative runs (with the trace printed at the start of each loop iteration):
000, len = 3
k = 0, total_iter = 1
vector a = -1 0
k = 1, total_iter = 2
vector a = 0 0
Paid O(n)
k = 1, total_iter = 3
vector a = 0 1
Paid O(n)
k = 1, total_iter = 4
vector a = 0 2
k = 0, total_iter = 5
vector a = 0 3
k = 1, total_iter = 6
vector a = 1 1
Paid O(n)
k = 1, total_iter = 7
vector a = 1 2
k = 0, total_iter = 8
vector a = 1 3
k = 1, total_iter = 9
vector a = 2 2
k = 0, total_iter = 10
vector a = 2 3
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gsamaras#pythagoras:~/Desktop/generate_bitStrings_HammDistanceT$ ./iter
0000, len = 4
k = 0, total_iter = 1
vector a = -1 0
k = 1, total_iter = 2
vector a = 0 0
Paid O(n)
k = 1, total_iter = 3
vector a = 0 1
Paid O(n)
k = 1, total_iter = 4
vector a = 0 2
Paid O(n)
k = 1, total_iter = 5
vector a = 0 3
k = 0, total_iter = 6
vector a = 0 4
k = 1, total_iter = 7
vector a = 1 1
Paid O(n)
k = 1, total_iter = 8
vector a = 1 2
Paid O(n)
k = 1, total_iter = 9
vector a = 1 3
k = 0, total_iter = 10
vector a = 1 4
k = 1, total_iter = 11
vector a = 2 2
Paid O(n)
k = 1, total_iter = 12
vector a = 2 3
k = 0, total_iter = 13
vector a = 2 4
k = 1, total_iter = 14
vector a = 3 3
k = 0, total_iter = 15
vector a = 3 4
The while loop is somewhat clever and subtle, and it's arguable that it's doing two different things (or even three if you count the initialisation of a). That's what's making your complexity calculations challenging, and it's also less efficient than it could be.
In the abstract, to incrementally compute the next set of indices from the current one, the idea is to find the last position i for which a[i] is less than n-dist+i, increment a[i], and set the following entries to a[i]+1, a[i]+2, and so on.
For example, if dist=5, n=11 and your indexes are:
0, 3, 5, 9, 10
Then 5 is the last value less than n-dist+i (because n-dist is 6, and 10=6+4, 9=6+3, but 5<6+2).
So we increment 5, and set the subsequent integers to get the set of indexes:
0, 3, 6, 7, 8
Now consider how your code runs, assuming k=4
0, 3, 5, 9, 10
a[k] + 1 is 11, so k becomes 3.
++a[k] is 10, so a[k+1] becomes 10, and k becomes 4.
++a[k] is 11, so k becomes 3.
++a[k] is 11, so k becomes 2.
++a[k] is 6, so a[k+1] becomes 6, and k becomes 3.
++a[k] is 7, so a[k+1] becomes 7, and k becomes 4.
++a[k] is 8, and we continue to call the print function.
This code is correct, but it's not efficient because k scuttles backwards and forwards as it's searching for the highest index that can be incremented without causing an overflow in the higher indices. In fact, if the index to be incremented is j places from the end, the code uses a non-linear number of iterations of the while loop. You can easily demonstrate this yourself if you trace how many iterations of the while loop occur when n==dist for different values of n. There is exactly one line of output, but you'll see O(2^n) growth in the number of iterations (in fact, you'll see 2^(n+1)-2 iterations).
This scuttling makes your code needlessly inefficient, and also hard to analyse.
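To check those iteration counts without reading traces, here is a small sketch that keeps the control flow of your loop but replaces print with counters. The names LoopCounts and countHammingLoop are made up for this example, and n is passed directly instead of a string.

```cpp
#include <cassert>
#include <vector>

// Counters for one run of the original loop.
struct LoopCounts {
    long long iterations;  // passes through the while loop
    long long prints;      // times print(num, a) would have been called
};

// Same branching as hamming(), with the O(n) print replaced by a counter.
LoopCounts countHammingLoop(int n, int dist) {
    assert(dist > 0);
    std::vector<int> a(dist);
    int k = 0;
    a[k] = -1;
    LoopCounts c{0, 0};
    while (true) {
        ++c.iterations;
        if (++a[k] >= n) {
            if (k == 0) return c;  // first index overflowed: done
            --k;                   // backtrack one level
        } else if (k == dist - 1) {
            ++c.prints;            // a complete index set
        } else {
            a[k + 1] = a[k];       // descend; next pass bumps it to a[k]+1
            ++k;
        }
    }
}
```

For dist = 2 this reproduces the 10, 15, 21 and 28 iterations from the question for n = 3, 4, 5 and 6, and for n == dist it gives one print with 2^(n+1)-2 iterations (14 for n = 3, 30 for n = 4).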
Instead, you can write the code in a more direct way:
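void hamming2(const char* num, size_t dist) {
    int n = static_cast<int>(strlen(num));
    int d = static_cast<int>(dist);
    if (d > n) return;  // no set of d distinct indices exists
    vector<int> a(d);   // a vector instead of the non-standard int a[dist]
    for (int i = 0; i < d; i++) {
        a[i] = i;
    }
    while (true) {
        print(num, a);
        int i;
        // find the last position that can still be incremented
        for (i = d - 1; i >= 0; i--) {
            if (a[i] < n - d + i) break;
        }
        if (i < 0) return;
        a[i]++;
        for (int j = i + 1; j < d; j++) a[j] = a[i] + j - i;
    }
}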
Now, each time through the while loop produces a new set of indexes. The exact cost per iteration is not straightforward, but since print is O(n), and the remaining code in the while loop is at worst O(dist), the overall cost is O(N_INCR_SEQ(n, dist) * n), where N_INCR_SEQ(n, dist) is the number of increasing sequences of natural numbers < n of length dist. Someone in the comments provides a link that gives a formula for this.
Notice that, given n, the length, and t, the required distance, the number of increasing sequences of t integers between 1 and n (or, as indices, between 0 and n-1) is indeed n choose t, since we pick t distinct indices.
The problem lies in how you generate those sequences:
- First, notice that, for example, in the case of length 4 you actually go over 5 different indices, 0 to 4.
- Secondly, notice that you take into account sequences with identical indices (in the case of t=2, that's 0 0, 1 1, 2 2 and so on); in general, you go through every non-decreasing sequence instead of every increasing sequence.
So when calculating the time complexity of your program, make sure you take that into account.
Hint: try to make a one-to-one correspondence from the universe of those sequences to the universe of integer solutions to some equation.
If you need the direct solution, take a look here:
https://math.stackexchange.com/questions/432496/number-of-non-decreasing-sequences-of-length-m
The final solution is (n+t-1) choose (t), but, given the first bullet, in your program it's actually ((n+1)+t-1) choose (t), since you loop with one extra index.
Denote
A := ((n+1)+t-1) choose (t), B := n choose t.
Overall we get O(1) + B*O(n) + (A-B)*O(1).
I have come across a problem where we want to find the length of the longest increasing subsequence within a range. We are given:
an array A consisting of N integers.
M queries (Li, Ri)
for each query we want to find the length of the longest increasing subsequence in
array A[Li], A[Li + 1], ..., A[Ri].
I implemented finding the subsequence using a DP approach:
// mind the REPN, LLD, these are macros I use for programming
// LLD = long long int
// REPN(i, a, b) = for (int i = a; i < b; ++i)
LLD a[n], dp[n];
REPN(i, 0, n)
{
    scanf("%lld", &a[i]);
    dp[i] = 1;
}
REPN(i, 1, n)
{
    REPN(j, 0, i)
    {
        if(a[i] > a[j])
            dp[i] = std::max(dp[j] + 1, dp[i]);
    }
}
For example:
Array: 1 3 8 9 7 2 4 5 10 6
dplis: 1 2 3 4 3 2 3 4 5 5
max: 5
But for the range Li=2 and Ri=9:
Then:
Array: 3 8 9 7 2 4 5 10
dplis: 1 2 3 2 1 2 3 4
max: 4
How can I determine the length of the longest increasing subsequence in a subarray?
PS: I don't want to recompute the whole dplis array. I want to reuse the original one, because too much computation defeats the purpose of the question.
One approach was to construct a complete 2D DP array, where row k holds the DP values for the subarray starting at position k (for k from 0 to n), but it fails on many cases due to TLE (Time Limit Exceeded):
REPN(k,0,n) {
    REPN(i,k+1,n) {
        REPN(j,k,i) {
            if(a[i]>a[j]) dp[k][i]=std::max(dp[k][j]+1, dp[k][i]);
        }
    }
}
REPN(i,0,q) {
    read(l); read(r);
    LLD max=-1;
    // scan only the queried range; j avoids shadowing the query counter i
    REPN(j,l-1,r) {
        if(max<dp[l-1][j]) max=dp[l-1][j];
    }
    printf("%lld\n", max);
}
If you have any new logic/implementation, I will gladly study it in-depth. Cheers.
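For comparison, here is a hedged baseline sketch that answers one query in O(len log len) with the usual "tails" array and binary search, where len = Ri - Li + 1. It does recompute per query, so it is not the reuse of dplis asked for above (the global dp values depend on elements to the left of Li, as the example shows), but it avoids both the O(len^2) inner loops and the 2D table. The name lisInRange is made up for this example, and l and r are assumed to be 1-based and inclusive, as in the Li=2, Ri=9 example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Length of the longest strictly increasing subsequence of a[l..r],
// with l and r 1-based and inclusive.
int lisInRange(const std::vector<long long>& a, int l, int r) {
    assert(1 <= l && l <= r && r <= static_cast<int>(a.size()));
    // tails[k] = smallest possible last value of an increasing
    // subsequence of length k + 1 seen so far
    std::vector<long long> tails;
    for (int i = l - 1; i < r; ++i) {
        auto it = std::lower_bound(tails.begin(), tails.end(), a[i]);
        if (it == tails.end())
            tails.push_back(a[i]);  // a[i] extends the longest subsequence
        else
            *it = a[i];             // a[i] gives a smaller tail for that length
    }
    return static_cast<int>(tails.size());
}
```

On the array 1 3 8 9 7 2 4 5 10 6 this gives 5 for the full range and 4 for Li=2, Ri=9, matching the dplis maxima above. Whether it is fast enough depends on N and M, which the question does not state.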
Do you know some algorithm (better than brute force) which can find pairs of vertices in a graph that are separated by one vertex and aren't connected to each other? Example:
In this graph the found pairs would be:
1 - 4
2 - 4
3 - 5
The best would be C++ code which uses an array of STL lists as the graph representation, but code in any other procedural language or pseudocode would also be fine.
One way would be based on a breadth-first-style search, where for each vertex i in the graph, we scan the vertices adjacent to those adjacent to i (i.e. two levels of adjacency!).
mark = array[0..n-1] of 0
flag = 1
for i = nodes in graph do
    // mark pattern of nodes adjacent to i
    mark[i] = flag
    for j = nodes adjacent to i do
        mark[j] = flag
    endfor
    // scan nodes adjacent to those adjacent to i
    // (separated by one vertex!)
    for j = nodes adjacent to i do
        for k = nodes adjacent to j do
            if mark[k] != flag and k > i then
                // i,k are separated by another vertex
                // and there is no edge i,k
                // prevent duplicates
                mark[k] = flag
            endif
        endfor
    endfor
    // implicit unmarking of current pattern
    flag += 1
endfor
If the graph had m edges per vertex, this would be an O(n * m^2) algorithm that requires O(n) extra space.
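Since the question asks for C++ with an array of STL lists, here is a rough translation of the pseudocode above, offered as a sketch. The function name pairsAtDistanceTwo is made up, vertices are 0-based, the graph is assumed to be undirected with both directions stored, and the loop index i is used as the flag value instead of a separate counter.

```cpp
#include <cassert>
#include <list>
#include <utility>
#include <vector>

// Returns all pairs (i, k) with i < k that share a neighbour
// but have no edge between them.
std::vector<std::pair<int, int>> pairsAtDistanceTwo(
        const std::vector<std::list<int>>& adj) {
    const int n = static_cast<int>(adj.size());
    std::vector<int> mark(n, -1);  // mark[v] == i means "seen while processing i"
    std::vector<std::pair<int, int>> result;
    for (int i = 0; i < n; ++i) {
        // mark i and its direct neighbours
        mark[i] = i;
        for (int j : adj[i]) mark[j] = i;
        // scan the neighbours of the neighbours
        for (int j : adj[i]) {
            for (int k : adj[j]) {
                if (mark[k] != i && k > i) {
                    result.emplace_back(i, k);
                    mark[k] = i;  // prevent reporting (i, k) twice
                }
            }
        }
        // no reset needed: the next i uses a different mark value
    }
    return result;
}
```

For a graph with edges 1-2, 1-3, 2-3, 3-4 and 4-5 (stored 0-based) this returns (0,3), (1,3) and (2,4), i.e. the pairs 1 - 4, 2 - 4 and 3 - 5 from the question.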
One simple and intuitive solution to this problem lies in the adjacency matrix. As we know, the (i,j)th element of the nth power of an adjacency matrix counts the walks of length exactly n between i and j.
So I just read in A, the adjacency matrix, and then calculate A^2. Finally, I list all the pairs which have exactly one path of length 2 between them and no direct edge.
//sg
#include<stdio.h>
#define MAX_NODE 10
int main()
{
    int a[MAX_NODE][MAX_NODE],c[MAX_NODE][MAX_NODE];
    int i,j,k,n;
    printf("Enter the number of nodes : ");
    scanf("%d",&n);
    for(i=0;i<n;i++)
        for(j=0;j<=i;j++)
        {
            printf("Edge from %d to %d (1 yes/0 no) ? : ",i+1,j+1);
            scanf("%d",&a[i][j]);
            a[j][i]=a[i][j]; //undirected graph
        }
    //dump the graph
    for(i=0;i<n;i++)
    {
        for(j=0;j<n;j++)
        {
            c[i][j]=0;
            printf("%d",a[i][j]);
        }
        printf("\n");
    }
    printf("\n");
    //multiply
    for(i=0;i<n;i++)
        for(j=0;j<n;j++)
            for(k=0;k<n;k++)
            {
                c[i][j]+=a[i][k]*a[k][j];
            }
    //result of the multiplication
    for(i=0;i<n;i++)
    {
        for(j=0;j<n;j++)
        {
            printf("%d",c[i][j]);
        }
        printf("\n");
    }
    for(i=0;i<n;i++)
        for(j=0;j<=i;j++)
        {
            if(c[i][j]==1&&(!a[i][j])&&(i!=j)) //list the paths
            {
                printf("\n%d - %d",i+1, j+1 );
            }
        }
    return 0;
}
Sample Run For Your Graph
[aman#aman c]$ ./Adjacency2
Enter the number of nodes : 5
Edge from 1 to 1 (1 yes/0 no) ? : 0
Edge from 2 to 1 (1 yes/0 no) ? : 1
Edge from 2 to 2 (1 yes/0 no) ? : 0
Edge from 3 to 1 (1 yes/0 no) ? : 1
Edge from 3 to 2 (1 yes/0 no) ? : 1
Edge from 3 to 3 (1 yes/0 no) ? : 0
Edge from 4 to 1 (1 yes/0 no) ? : 0
Edge from 4 to 2 (1 yes/0 no) ? : 0
Edge from 4 to 3 (1 yes/0 no) ? : 1
Edge from 4 to 4 (1 yes/0 no) ? : 0
Edge from 5 to 1 (1 yes/0 no) ? : 0
Edge from 5 to 2 (1 yes/0 no) ? : 0
Edge from 5 to 3 (1 yes/0 no) ? : 0
Edge from 5 to 4 (1 yes/0 no) ? : 1
Edge from 5 to 5 (1 yes/0 no) ? : 0
01100
10100
11010
00101
00010
21110
12110
11301
11020
00101
4 - 1
4 - 2
5 - 3
Analysis
For n vertices :
Time : O(n^3). Can be reduced to roughly O(n^2.37) with fast matrix multiplication, which is very good.
Space : O(n^2).
You can do this with an adapted version of Warshall's algorithm. The algorithm in the following code example uses the adjacency matrix of your graph and prints i j if there is an edge from i to k and an edge from k to j but no direct edge from i to j.
#include <iostream>
int main() {
    // Adjacency Matrix of your graph
    const int n = 5;
    bool d[n][n] = {
        { 0, 1, 1, 0, 0 },
        { 0, 0, 1, 0, 0 },
        { 0, 0, 0, 1, 0 },
        { 0, 0, 0, 0, 1 },
        { 0, 0, 0, 0, 0 },
    };
    // Modified Warshall Algorithm
    for (int k = 0; k < n; k++)
        for (int i = 0; i < n; i++)
            if (d[i][k])
                for (int j = 0; j < n; j++)
                    if (d[k][j] && !d[i][j])
                        std::cout << i + 1 << " " << j + 1 << std::endl;
}
You can view the result online.