all solutions to change making with dynamic programming - c++

I was reviewing my handouts for our algorithms class and I started to think about this question:
Given different types of coins with different values, find all coin configurations that add up to a certain sum, without duplication.
During class, we solved the problem of finding the number of all possible ways to make a sum and the least number of coins for a sum. However, we never tried to actually find the solutions.
I was thinking about solving this problem with dynamic programming.
I came up with the recursive version (for simplicity I only collect the solutions):
void solve(vector<string>& result, string& currSoln, int index, int target, vector<int>& coins)
{
    if (target < 0)
    {
        return;
    }
    if (target == 0)
    {
        result.push_back(currSoln);
        return;   // nothing left to add once the target is reached
    }
    for (int i = index; i < (int)coins.size(); ++i)
    {
        stringstream ss;
        ss << coins[i];
        string newCurrSoln = currSoln + ss.str() + " ";
        // never go back to a smaller coin index, so each combination appears only once
        solve(result, newCurrSoln, i, target - coins[i], coins);
    }
}
However, I got stuck when trying to use DP to solve the problem.
I have 2 major obstacles:
I don't know what data structure I should use to store previous answers.
I don't know what my bottom-up procedure (using loops to replace recursion) should look like.
Any help is welcome and some code would be appreciated!
Thank you for your time.

In a DP solution you generate a set of intermediate states, and count how many ways there are to get to each one. Then your answer is the count that wound up in the success state.
So, for change counting, the states are the specific amounts of change you can reach. The counts are the number of ways of making that much change. And the success state is that you made the correct amount of change.
To go from counting solutions to enumerating them you need to keep those intermediate states, and also keep a record in each state of all of the states that transitioned to that one - and information about how. (In the case of change counting, the how would be which coin you added.)
Now with that information you can start from the success state and recursively go backwards through the DP data structures to actually find the solutions rather than the count. The good news is that all of your recursive work is efficient - you're always only looking at paths that succeed, so you waste no time on things that won't work. But if there are a billion solutions, then there is no royal shortcut that makes printing out a billion solutions fast.
If you wish to be a little clever, though, you can turn this into a usable enumeration. You can, for instance, say "I know there are 4323431 solutions, what is the 432134'th one?" And finding that solution will be quick.
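To make that concrete, here is a minimal bottom-up sketch of the idea (my own illustration, not code from the class handout): a parents table records, for every reachable amount, every (previous amount, coin index) transition into it, and the reconstruction then walks backwards from the target. The coin-index filter during reconstruction is what stops the same combination from being produced twice; using just the amount as the DP state keeps the table small, at the cost of occasionally wandering into a branch the filter then rejects.
#include <string>
#include <vector>
using namespace std;

struct Step { int prevAmount; int coinIndex; };

// Walk backwards from `amount` to 0, only removing coins whose index is >= minIndex,
// so every combination is produced exactly once.
void reconstruct(int amount, int minIndex,
                 const vector<vector<Step>>& parents,
                 const vector<int>& coins,
                 string soln, vector<string>& result)
{
    if (amount == 0) { result.push_back(soln); return; }
    for (const Step& s : parents[amount])
        if (s.coinIndex >= minIndex)
            reconstruct(s.prevAmount, s.coinIndex, parents, coins,
                        to_string(coins[s.coinIndex]) + " " + soln, result);
}

vector<string> allSolutions(int target, const vector<int>& coins)
{
    // Bottom-up pass: for every reachable amount, remember how it can be reached.
    vector<vector<Step>> parents(target + 1);
    vector<bool> reachable(target + 1, false);
    reachable[0] = true;
    for (int a = 1; a <= target; ++a)
        for (int i = 0; i < (int)coins.size(); ++i)
            if (a - coins[i] >= 0 && reachable[a - coins[i]]) {
                reachable[a] = true;
                parents[a].push_back({a - coins[i], i});
            }
    vector<string> result;
    if (reachable[target])
        reconstruct(target, 0, parents, coins, "", result);
    return result;
}
For example, allSolutions(4, {1, 2, 3}) yields the four combinations 1+1+1+1, 1+1+2, 1+3 and 2+2.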

It is immediately obvious that you can take a dynamic programming approach. What isn't obvious is that in most cases (depending on the denominations of the coins) you can use the greedy algorithm, which is likely to be more efficient. See Cormen, Leiserson, Rivest, Stein: Introduction to Algorithms, 2nd ed, problem 16.1.
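For contrast, the greedy strategy the book problem refers to is only a few lines (a sketch of mine; it answers the fewest-coins variant from class rather than the enumeration, and is only guaranteed optimal for "canonical" coin systems such as 1/5/10/25):
// Greedy change making: repeatedly take the largest coin that still fits.
#include <algorithm>
#include <vector>

std::vector<int> greedyChange(int target, std::vector<int> coins)
{
    std::sort(coins.rbegin(), coins.rend());        // largest denomination first
    std::vector<int> used;
    for (int c : coins)
        while (target >= c) { target -= c; used.push_back(c); }
    // if target is still non-zero here, greedy failed to make exact change
    return used;
}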

Is this connected-component labeling algorithm new?

A long time ago, I made a game in which a sort of connected-component labeling was required to implement the AI part. I used the two-pass algorithm unknowingly at that time.
Recently, I got to know that I can make it faster using a bit-scan based method instead. It uses 1-bit-per-pixel data as input, instead of the typical bytes-per-pixel input. It then finds every linear chunk in each scan line using the BSF instruction. Please see the code below. Cut is a struct which stores information about a linear chunk of 1 bits in a scan line.
Cut* get_cuts_in_row(const u32* bits, const u32* bit_final, Cut* cuts) {
    u32 working_bits = *bits;
    u32 basepos = 0, bitpos = 0;
    for (;; cuts++) {
        //find starting position
        while (!_BitScanForward(&bitpos, working_bits)) {
            bits++, basepos += 32;
            if (bits == bit_final) {
                cuts->start_pos = (short)0xFFFF;
                cuts->end_pos = (short)0xFFFF;
                return cuts + 1;
            }
            working_bits = *bits;
        }
        cuts->start_pos = short(basepos + bitpos);
        //find ending position
        working_bits = (~working_bits) & (0xFFFFFFFF << bitpos);
        while (!_BitScanForward(&bitpos, working_bits)) {
            bits++, basepos += 32;
            working_bits = ~(*bits);
        }
        working_bits = (~working_bits) & (0xFFFFFFFF << bitpos);
        cuts->end_pos = short(basepos + bitpos);
    }
}
First, it uses the BSF instruction (via the _BitScanForward intrinsic) to find the first position at which a 1 bit appears. Once that is found, it finds the first position at which a 0 bit appears after that position, using bit inversion and bit masking, then repeats this process.
After getting the starting position and the ending position of all linear chunks of 1s (I prefer to refer to them as 'cuts') in every scan line, it assigns labels to them in the usual CCL manner. For the first row, every cut gets a different label.
For each cut in the remaining rows, it first checks whether there are upper cuts connected to it. If no upper cut is connected to it, it gets a new label. If only one upper cut is connected to it, it gets a copy of that label. If several upper cuts are connected to it, those labels are merged and it gets the merged one. This can be done easily using two advancing pointers over the upper cuts and the lower cuts. Here is the full code doing that part.
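As a side note, _BitScanForward is an MSVC intrinsic (declared in <intrin.h>); if you want to try the code elsewhere, a portable stand-in along these lines should behave the same way (my own sketch, using C++20's std::countr_zero; GCC/Clang's __builtin_ctz would also work):
// Hypothetical portable replacement for the MSVC _BitScanForward intrinsic.
#include <bit>
#include <cstdint>

inline bool bit_scan_forward(uint32_t* index, uint32_t mask) {
    if (mask == 0) return false;          // no set bit: mirror the intrinsic's zero return
    *index = std::countr_zero(mask);      // position of the lowest set bit
    return true;
}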
Label* get_labels_8c(Cut* cuts, Cut* cuts_end, Label* label_next) {
    Cut* cuts_up = cuts;
    //generate labels for the first row
    for (; cuts->start_pos != 0xFFFF; cuts++) cuts->label = [GET NEW LABEL FROM THE POOL];
    cuts++;
    //generate labels for the rest of the rows
    for (; cuts != cuts_end; cuts++) {
        Cut* cuts_save = cuts;
        for (;; cuts++) {
            u32 start_pos = cuts->start_pos;
            if (start_pos == 0xFFFF) break;
            //skip upper cuts that end before this cut starts
            for (; cuts_up->end_pos < start_pos; cuts_up++);
            //no upper cut meets this one
            u32 end_pos = cuts->end_pos;
            if (cuts_up->start_pos > end_pos) {
                cuts->label = [GET NEW LABEL FROM THE POOL];
                continue;
            }
            Label* label = label_equiv_recursion(cuts_up->label);
            //the next upper cut cannot meet this one
            if (end_pos <= cuts_up->end_pos) {
                cuts->label = label;
                continue;
            }
            //find the next upper cuts that meet this one
            for (; cuts_up->start_pos <= end_pos; cuts_up++) {
                Label* label_other = label_equiv_recursion(cuts_up->label);
                if (label != label_other) [MERGE TWO LABELS]
                if (end_pos <= cuts_up->end_pos) break;
            }
            cuts->label = label;
        }
        cuts_up = cuts_save;
    }
    return label_next;
}
After this, one can use this per-scan-line information to build the array of labels, or any other output, directly.
I measured the execution time of this method and found that it's much faster than the two-scan method I previously used. Surprisingly, it turned out to be much faster than the two-scan one even when the input data is random. Apparently the bit-scanning algorithm is best for data with relatively simple structures, where the chunks in each scan line are big. It wasn't designed to be used on random images.
What baffled me was that literally nobody talks about this method. Frankly speaking, it doesn't seem like an idea that is hard to come up with. It's hard to believe that I'm the first one who tried it.
Perhaps, while my method is better than the primitive two-scan method, it's worse than more developed ones based on the two-scan idea, so it isn't worth mentioning anyway.
However, if the two-scan method can be improved, the bit-scan method can be too. I myself found a nice improvement for 8-connectivity. It analyses two neighboring scan lines at once by merging them with the bitwise OR instruction. You can find the full code and a detailed explanation of how it works here.
I got to know that there is a benchmark for CCL algorithms named YACCLAB. I'll test my algorithms in it against the best CCL algorithms to see how good they really are. Before that, I want to ask several things here.
My question is,
Are these algorithms I found really new? It's still hard to believe that nobody has ever thought of a CCL algorithm using bit scanning. If it's already a thing, why can't I find anyone talking about it? Were bit-scan based algorithms proven to be bad and forgotten?
If I really found a new algorithm, what should I do next? Of course I'll test it in a more reliable system like YACCLAB. I'm asking about what comes after that: what should I do to make these algorithms mine and spread them?
So far, I'm a bit sceptical.
My reasoning was getting too long for a comment, so here we are. There is a lot to unpack. I like the question quite a bit even though it might be better suited for a computer science site.
The thing is, there are two layers to this question:
Was a new algorithm discovered?
What about the bit scanning part?
You are combining these two so first I will explain why I would like to think about them separately:
An algorithm is a set of steps (that is the more formal definition) and is language-agnostic. As such it should work even without the bit scanning.
The bit scanning, on the other hand, I would consider an optimization technique - we are using a structure that the computer is comfortable with, which can bring us performance gains.
Unless we separate these two, the question gets a bit fuzzy since there are several possible scenarios that can be happening:
The algorithm is new and improved and bit scanning makes it even faster. That would be awesome.
The algorithm is just a new way of saying "two pass" or something similar. That would still be good if it beats the benchmarks. In this case it might be worth adding to a library for the CCL.
The algorithm is a good fit for some cases but somehow fails in others (speed-wise, not correctness-wise). The bit scanning here makes the comparison difficult.
The algorithm is a good fit for some cases but completely fails in others (produces incorrect results). You just haven't found a counterexample yet.
Let us assume that 4 isn't the case and we want to decide among scenarios 1 to 3. In each case, the bit scanning muddies the comparison since it most likely speeds things up even further - so in some cases even a slower algorithm could outperform a better one.
So first I would try to remove the bit scanning and re-evaluate the performance. After a quick look it seems that the algorithms for CCL have linear complexity, depending on the image size - you need to check every pixel at least once. The rest is the fight to lower the constant as much as possible (number of passes, number of neighbours to check, etc.). I think it is safe to assume that you can't do better than linear - so the first question is: does your algorithm lower that multiplicative constant? Since the algorithm is linear, the factor translates directly into performance, which is nice.
Second question would then be: Does bit scanning further improve the performance of the algorithm?
Also, since I already started thinking about it, what about a chessboard pattern and 4-connectivity? Or alternatively, a chessboard of 3x3 crosses for 8-connectivity.
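If you want to try that, a quick generator for such a worst-case pattern might look like this (my own sketch, assuming the 1-bit-per-pixel, 32-pixels-per-u32, least-significant-bit-first layout used in the question):
// Build a chessboard test image in the assumed 1-bit-per-pixel row format.
#include <cstdint>
#include <vector>

std::vector<uint32_t> make_chessboard(int width, int height) {
    int words_per_row = (width + 31) / 32;
    std::vector<uint32_t> img(words_per_row * height, 0);
    for (int y = 0; y < height; ++y)
        for (int x = (y & 1); x < width; x += 2)          // alternate starting column per row
            img[y * words_per_row + x / 32] |= 1u << (x % 32);
    return img;
}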

Shouldn't this be using a backtracking algorithm?

I am solving some questions on LeetCode. One of the questions is:
Given an m x n grid filled with non-negative numbers, find a path from top left to bottom right which minimizes the sum of all numbers along its path. You can only move either down or right at any point in time.
The editorial as well as the solutions posted all use dynamic programming. One of the most upvoted solutions is as follows:
class Solution {
public:
    int minPathSum(vector<vector<int>>& grid) {
        int m = grid.size();
        int n = grid[0].size();
        vector<vector<int>> sum(m, vector<int>(n, grid[0][0]));
        for (int i = 1; i < m; i++)
            sum[i][0] = sum[i - 1][0] + grid[i][0];
        for (int j = 1; j < n; j++)
            sum[0][j] = sum[0][j - 1] + grid[0][j];
        for (int i = 1; i < m; i++)
            for (int j = 1; j < n; j++)
                sum[i][j] = min(sum[i - 1][j], sum[i][j - 1]) + grid[i][j];
        return sum[m - 1][n - 1];
    }
};
My question is simple: shouldn't this be solved using backtracking? Suppose the input matrix is something like:
[
[1,2,500]
[100,500,500]
[1,3,4]
]
My doubt arises because in DP, the solutions to subproblems are part of the global solution (optimal substructure). However, as can be seen above, when we make the local choice of 2 over 100, we might be wrong, since the future paths might be too expensive (all numbers surrounding 2 are 500s). So how is using dynamic programming justified in this case?
To summarize:
Shouldn't we use backtracking, since we might have to retract our path if we have made an incorrect choice previously (being misled by a local minimum)?
How is this a dynamic programming question?
P.S.: The above solution definitely runs.
The example you illustrated above shows that a greedy solution to the problem will not necessarily produce an optimal solution, and you're absolutely right about that.
However, the DP solution to this problem doesn't quite use this strategy. The idea behind the DP solution is to compute, for each location, the cost of the shortest path ending at that location. In the course of solving the overall problem, the DP algorithm will end up computing the length of some shortest paths that pass through the 2 in your grid, but it won't necessarily use those intermediate shortest paths when determining the overall shortest path to return. Try tracing through the above code on your example - do you see how it computes and then doesn't end up using those other path options?
Shouldn't we use backtracking, since we might have to retract our path if we have made an incorrect choice previously (being misled by a local minimum)?
In a real-world scenario, there will be quite a few factors that will determine which algorithm will be better suited to solve this problem.
This DP solution is alright in the sense that it will give you the best performance/memory usage when handling worst-case scenarios.
Any backtracking/Dijkstra/A* algorithm will need to maintain a full matrix as well as a list of open nodes. This DP solution just assumes every node will end up being visited, so it can ditch the open node list and just maintain the cost buffer.
By assuming every node will be visited, it also gets rid of the "which node do I open next" part of the algorithm.
So if optimal worst-case performance is what we are looking for, then this algorithm is actually going to be very hard to beat. But whether that's what we want or not is a different matter.
How is this a dynamic programming question?
This is only a dynamic programming question in the sense that there exists a dynamic programming solution for it. But by no means is DP the only way to tackle it.
Edit: Before I get dunked on, yes there are more memory-efficient solutions, but at very high CPU costs in the worst-case scenarios.
For your input
[
[ 1, 2, 500]
[100, 500, 500]
[ 1, 3, 4]
]
the sum array comes out to
[
[ 1, 3, 503]
[101, 503, 1003]
[102, 105, 109]
]
And we can even retrace the shortest path:
109, 105, 102, 101, 1
The algorithm doesn't check each path individually, but uses the property that it can extend a previous optimal path to compute the current cost:
sum[i][j] = min(sum[i - 1][j],   // best cost of reaching the cell above
                sum[i][j - 1])   // or the cell to the left
            + grid[i][j];        // plus the current cell's cost
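For completeness, the path itself can be recovered by walking the sum table backwards from the bottom-right corner, always stepping to the cheaper predecessor (a sketch of mine, assuming the same grid/sum layout as above):
// Recover the actual cheapest path; returns grid values from end to start.
#include <vector>

std::vector<int> retracePath(const std::vector<std::vector<int>>& grid,
                             const std::vector<std::vector<int>>& sum)
{
    int i = grid.size() - 1, j = grid[0].size() - 1;
    std::vector<int> path{grid[i][j]};
    while (i > 0 || j > 0) {
        if (i == 0)                             j--;   // only the left neighbour exists
        else if (j == 0)                        i--;   // only the upper neighbour exists
        else if (sum[i - 1][j] < sum[i][j - 1]) i--;   // upper predecessor is cheaper
        else                                    j--;   // left predecessor is cheaper (or tied)
        path.push_back(grid[i][j]);
    }
    return path;   // e.g. 4, 3, 1, 100, 1 for the grid above (reverse order)
}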
Backtracking, in itself, doesn't fit this problem particularly well.
Backtracking works well for problems like eight queens, where a proposed solution either works, or it doesn't. We try a possible route to a solution, and if it fails, we backtrack and try another possible route, until we find one that works.
In this case, however, every possible route gets us from the beginning to the end. We can't just try different possibilities until we find one that works. Instead, we have to basically try every route from beginning to end, until we find the one that works best (the lowest weight, in this case).
Now, it's certainly true that with backtracking and pruning, we could (perhaps) improve our approach to this solution, to at least some degree. In particular, let's assume you did a search that started by looking downward (if possible) and then to the side. In this case, with the input you gave, its first attempt would end up being the optimal route.
The question is whether it can recognize that, and prune some branches of the tree without traversing them entirely. The answer is that yes, it can. To do that, it keeps track of the best route it's found so far, and based upon that, it can reject entire sub-trees. In this case its first route gives a total weight of 109. Then it tries to the right of the first node, which is a 2, for a total weight of 3 so far. That's smaller than 109, so it proceeds. From there, it looks downward and gets to the 500. That gives a weight of 503, so without doing any further looking, it knows no route from there can be suitable, so it stops and prunes off all the branches that start from that 500. Then it tries rightward from the 2 and finds another 500. This lets it prune that entire branch as well. So, in these cases, it never looks at the third 500, or the 3 and 4 at all--just by looking at the 500 nodes, we can determine that those can't possibly yield an optimal solution.
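A rough sketch of that pruned depth-first search, for illustration only (my own code, assuming the same grid representation as the DP solution above):
// Depth-first search with pruning ("branch and bound"). `best` holds the cheapest
// complete route found so far; any partial route that already costs at least that
// much is abandoned without exploring its subtree.
#include <limits>
#include <vector>

void dfs(const std::vector<std::vector<int>>& grid, int i, int j, int costSoFar, int& best)
{
    costSoFar += grid[i][j];
    if (costSoFar >= best) return;                          // prune: cannot beat the best route
    int m = grid.size(), n = grid[0].size();
    if (i == m - 1 && j == n - 1) { best = costSoFar; return; }
    if (i + 1 < m) dfs(grid, i + 1, j, costSoFar, best);    // look downward first
    if (j + 1 < n) dfs(grid, i, j + 1, costSoFar, best);    // then to the side
}

int minPathSumBacktracking(const std::vector<std::vector<int>>& grid)
{
    int best = std::numeric_limits<int>::max();
    dfs(grid, 0, 0, 0, best);
    return best;
}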
Whether that's really an improvement on the DP strategy largely comes down to a question of what operations cost how much. For the task at hand, it probably doesn't make much difference either way. If, however, your input matrix was a lot larger, it might. For example, we might have a large input stored in tiles. With a DP solution, we evaluate all the possibilities, so we always load all the tiles. With a tree-trimming approach, we might be able to completely avoid loading some tiles at all, because the routes including those tiles have already been eliminated.

Is one loop better than several of them?

I've been working on my implementation of BigInteger, and when I was contemplating the solution for addition, I decided to go with the cleaner one, which adds the corresponding digits first and "normalizes" them later. Like in the following example:
999 999 + 111 111
= 10 10 10 10 10 10 (value after addition)
= 1 111 110 (value after normalization)
But since then I have been wondering how this affects the efficiency of the program. Are several loops, each doing a small thing, generally going to work faster than one big loop that does everything?
For example, using
int a[7]={0,9,9,9,9,9,9};
int b[7]={0,1,1,1,1,1,1};
int c[7];
Is this,
for (int q = 0; q < 7; ++q) {
    c[q] = a[q] + b[q];
    if (c[q] > 9) {
        c[q - 1] = c[q] / 10;
        c[q] %= 10;
    }
}
better than this
for (int q = 0; q < 7; ++q) {
    c[q] = a[q] + b[q];
}
for (int q = 0; q < 7; ++q) {
    if (c[q] > 9) {
        c[q - 1] = c[q] / 10;
        c[q] %= 10;
    }
}
And what about bigger loops that have much more to do on each iteration?
UPD.
As someone suggested, I measured the running time for both examples. For two loops the average time (for 100 mil. elements) was ~4.85 s; for one loop, ~3.72 s.
It is very difficult to tell which one of the two approaches will be more efficient. It probably varies among C++ compiler vendors and within a single vendor, from version to version of their compiler.
The bottom line is:
You will never know unless you benchmark.
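For instance, a throwaway harness along these lines would settle the question for your specific compiler and machine (my own sketch; the loops start at index 1 so the carry write never runs off the front of the array, and the checksum stops the optimizer from discarding the work):
// Minimal micro-benchmark using <chrono>; the loop bodies mirror the two variants above.
#include <chrono>
#include <iostream>
#include <vector>

int main() {
    const int N = 100'000'000;                 // roughly 1.2 GB across the three vectors; shrink if needed
    std::vector<int> a(N, 9), b(N, 1), c(N);
    long long sink = 0;                        // keeps results observable for the optimizer

    auto t0 = std::chrono::steady_clock::now();
    for (int q = 1; q < N; ++q) {              // one loop: add and normalize together
        c[q] = a[q] + b[q];
        if (c[q] > 9) { c[q - 1] = c[q] / 10; c[q] %= 10; }
    }
    sink += c[N - 1];
    auto t1 = std::chrono::steady_clock::now();
    for (int q = 1; q < N; ++q) c[q] = a[q] + b[q];   // two loops: add first...
    for (int q = 1; q < N; ++q)                       // ...then normalize
        if (c[q] > 9) { c[q - 1] = c[q] / 10; c[q] %= 10; }
    sink += c[N - 1];
    auto t2 = std::chrono::steady_clock::now();

    std::cout << "one loop:  " << std::chrono::duration<double>(t1 - t0).count() << " s\n"
              << "two loops: " << std::chrono::duration<double>(t2 - t1).count() << " s\n"
              << "(checksum " << sink << ")\n";
}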
As usual, it is almost certain that it does not matter anyway, and you are most probably unduly concerned about performance, as the vast majority of programmers are, in the vast majority of cases.
At the end of the day, all that matters is what is more readable and more maintainable. Code maintainability is far more important than saving clock cycles.
If you decide to follow the wise path of "what is more readable" keep in mind that different folks find different things more readable. For example, I personally hate surprises when I am reading code, so I would be rather annoyed to read your first loop which allows decimal digits to receive erroneous values outside of the 0-9 range, only to find out later that you are finally remedying that with another loop.

Finding an optimal solution to a system of linear equations in c++

Here's the problem:
I am currently trying to create a control system which is required to find a solution to a series of complex linear equations that do not have a unique solution.
My problem arises because there will only ever be six equations, while there may be upwards of 20 unknowns (usually far more than six). Of course, this will not yield an exact solution through standard Gaussian elimination or by putting the system into reduced row echelon form.
However, I think that I may be able to narrow things down further and get a more accurate solution, because I know that each of the unknowns cannot have a value smaller than zero or greater than one, but is free to take on any value in between.
Of course, I am trying to create code that finds a correct solution, but in the case that there are multiple combinations that yield satisfactory results, I would want to minimize the sum of (value of unknown * efficiency constant) over all unknowns, i.e. Sigma[x_i * e_i] from i = 0 to n, though finding an accurate solution is the greater priority.
Performance is also important, due to the fact that this algorithm may need to be run several times per second.
So, does anyone have any ideas to help me implement this?
Edit: You might just want to stick to linear programming with equality and inequality constraints, but here's an interesting exact solution that does not incorporate the constraint that your unknowns are between 0 and 1.
Here's a powerpoint discussing your problem: http://see.stanford.edu/materials/lsoeldsee263/08-min-norm.pdf
I'll translate your problem into math to make things a bit easier to figure out:
You have a 6x20 matrix A and a vector x with 20 elements. You want to minimize (x^T)e subject to Ax = y. According to the slides, if you were just minimizing the norm of x (the sum of squares of its entries), then the answer is A^T(AA^T)^(-1)y. I'll take another look at this as soon as I get the chance and see what the solution is to minimizing (x^T)e (i.e. your specific problem).
Edit: I looked in the powerpoint some more and near the end there's a slide entitled "General norm minimization with equality constraints". I am going to switch the notation to match the slide's:
Your problem is that you want to minimize ||Ax-b||, where b = 0 and A is your e vector and x is the 20 unknowns. This is subject to Cx=d. Apparently the answer is:
x=(A^T A)^-1 (A^T b -C^T(C(A^T A)^-1 C^T)^-1 (C(A^T A)^-1 A^Tb - d))
It's not pretty, but it's not as bad as you might think. There really aren't that many calculations. For example, (A^T A)^-1 only needs to be calculated once and then you can reuse the answer. And your matrices aren't that big.
Note that I didn't incorporate the constraint that the elements of x are within [0,1].
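For what it's worth, the minimum-norm formula from the first slide, x = A^T(AA^T)^(-1)y, is a near one-liner with a linear-algebra library such as Eigen (a sketch of mine; like the formula itself, it ignores the [0, 1] bounds, for which you would need an LP/QP solver):
// Minimum-norm solution of the underdetermined system A x = y (6 equations, 20 unknowns).
#include <Eigen/Dense>

Eigen::VectorXd minimumNormSolution(const Eigen::MatrixXd& A,   // 6 x 20
                                    const Eigen::VectorXd& y)   // 6
{
    // x = A^T (A A^T)^{-1} y; solve() avoids forming the inverse explicitly.
    return A.transpose() * (A * A.transpose()).ldlt().solve(y);
}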
It looks like the solution for what I am doing is Linear Programming. It is starting to come back to me, but if I have other problems I will post them in their own dedicated questions instead of turning this into an encyclopedia.

Is long long in C++ known to be very nasty in terms of precision?

The Given Problem:
Given a theater with n rows and m seats per row, and a list of seats that are reserved, determine how many ways two friends can sit together in the same row.
So, if the theater were of size 2x3 and the very first seat in the first row were reserved, there would be 3 different seatings these two guys could take.
The Problem That I'm Dealing With
The function itself is supposed to return the number of seatings there are based on these constraints. The return value is a long long.
I've gone through my code many, many times... and I'm pretty sure that it's right. All I'm doing is incrementing this one value. However, ALL of the values that my function returns differ from the actual solution by 1 or 2.
Any ideas? And if you think that it's just something wrong with my code, please tell me. I don't mind being called an idiot as long as I learn something.
Unless you're overflowing or underflowing, it definitely sounds like something is wrong with your code. For integral types, there are no precision ambiguities in C or C++.
First, C++ didn't have a long long type before C++11. Second, in C99, long long can represent any integral value from LLONG_MIN (<= -2^63) to LLONG_MAX (>= 2^63 - 1) exactly. The problem lies elsewhere.
Given the description of the problem, I think it is unambiguous.
Normally, the issue would be that you don't know whether the order of the two friends matters or not, but the example clearly disambiguates: if the order mattered, we would have 6 solutions, not 3.
What is the value that your code gives for this toy example?
Anyway, I can add a few examples with my own values if you wish, so that you can compare against them; I can't do much more for you unless you post your code. Obviously, the rows are independent, so I'm only going to show the results row by row.
X occupied seat
. free seat
1: X..X
1: .X..
2: X...X
3: X...X..
5: ..X.....
From a computational point of view, I should note it's (at least) an O(N) process where N is the number of seats: you have to inspect nearly every seat once (except the first and last ones when the second and next-to-last are occupied), and it is indeed possible to solve this linearly.
From a technical point of view:
make sure you initialize your variable to 0
make sure you don't count too many seats on the toy examples
I'd be happy to help more, but I would not like to give you the full solution before you have had a chance to think it over and review your algorithm calmly.