I'm working on the Leetcode "Minimum Window Substring" practice problem:
Given two strings s and t of lengths m and n respectively, return the minimum window substring of s such that every character in t (including duplicates) is included in the window. If there is no such substring, return the empty string "".
The testcases will be generated such that the answer is unique.
Example 1:
Input: s = "ADOBECODEBANC", t = "ABC"
Output: "BANC"
Example 2:
Input: s = "a", t = "a"
Output: "a"
Example 3:
Input: s = "a", t = "aa"
Output: ""
Explanation: Both 'a's from t must be included in the window. Since the largest window of s only has one 'a', return empty string.
My solution uses two maps to keep track of character counts:
strr map is to keep count of characters in the window and
patt map is for the given pattern string.
It also uses two indices, start and end, to keep track of the current window (which includes end).
The core of the solution is an outer loop that advances end, adding the new character to strr. It then runs an inner loop as long as the window is valid that:
checks & updates the shortest window seen so far
removes the first character in the window
advances start.
Once the outer loop finishes, the shortest window it encountered should be the answer.
#include <climits>
#include <iostream>
#include <string>
#include <unordered_map>

bool check_map(std::unordered_map<char, int> patt, std::unordered_map<char, int> strr)
{
    for(auto data:patt)
    {
        if(strr[data.first] != data.second)
            return false;
    }
    return true;
}

std::string Substring(std::string s, std::string t)
{
    std::unordered_map<char, int> patt;
    std::unordered_map<char, int> strr;
    std::string ans;
    for(int i=0; i<t.length(); i++)
        patt[t[i]]++;
    int start = 0, length = INT_MAX;
    for(int end=0; end<s.length(); end++)
    {
        strr[s[end]]++;
        while(check_map(patt, strr))
        {
            if(length > (end-start+1))
            {
                ans = s.substr(start, end+1);
                length = end-start+1;
            }
            strr[s[start]]--;
            if(strr[s[start]] == 0)
                strr.erase(s[start]);
            start++;
        }
    }
    return ans;
}

int main()
{
    std::string s = "ADOBECODEBANC",
                pattern = "ABC";
    std::cout << "String: " << s << std::endl
              << "Pattern: " << pattern << std::endl
              << "Minimum Window Substring is " << Substring(s, pattern) << std::endl;
    return 0;
}
For example 1 from the problem, the program should return "BANC" but instead returns "ADOBEC". Program output:
String: ADOBECODEBANC
Pattern: ABC
Minimum Window Substring is ADOBEC
Where is the error in my code?
I am sorry that I cannot answer your concrete question of where the error in your code is.
But what I can do is help you to understand the problem, develop an algorithm, and show one of many potential solutions.
The title of the question already implies which algorithm should be used: the so-called “Sliding Window” algorithm.
You will find a very good explanation from Said Sryheni here.
For your problem, we will use the flexible-size sliding window approach.
We will iterate over the source string character by character and wait until we meet a certain condition: in this case, until we have “seen” all characters that need to be searched for. Then we will find a window that contains all of these characters.
In the given example, the end of the sliding window is always the last character read from the source string, because that character is the one that fulfilled the condition. Then we need to find the beginning of the window: the rightmost start position in the source string for which the condition is still fulfilled.
Then we will continue to read the source string and wait for the condition to be fulfilled again, and then recalculate the sliding window positions.
By the way: the other characters in the source string, besides the search characters, are just noise and will only extend the width of the sliding window.
But how do we meet the condition?
And especially, since the order of the search characters does not matter, and there can even be duplicate characters among them?
The solution is that we will “count”.
First, we will count the occurrences of each character in the search string. Additionally, we will use a second counter, the “match” counter, that indicates whether all characters are matched. It starts at the number of distinct search characters.
Then, while iterating over the source string, we will decrement the character counter for every character that we see. If the count of a search character hits 0, we will decrement the match counter. If that reaches 0 as well, we have found all search characters and the condition is fulfilled. We can then move on to calculating the window positions.
Please note: we only decrement the match counter if the character counter is exactly 0 after being decremented.
Example (I will omit the noise with the ‘x’es):
Search string “ABC”, source string: “xxAxxxxBBBxCAxx”.
Initial character counters will be 1,1,1 and the match counter will be 3. Below, the character counters for A,B,C are listed first, followed by the match counter.
Reading the first ‘A’. Counters: 0,1,1 2
Reading the first ‘B’. Counters: 0,0,1 1
Reading the 2nd ‘B’. Counters: 0,-1,1 1 (We will decrement the match counter only if character counter hits the 0).
Reading the 3rd ‘B’. Counters: 0,-2,1 1 (We will decrement the match counter only if character counter hits the 0).
Reading the first ‘C’. Counters: 0,-2,0 0. The match counter is 0, the condition is fulfilled.
Please note: negative character counts indicate that there are more of the same character further right.
Next, since the condition is now fulfilled, we will determine the positions of the sliding window. The end position is clear: it is the last character read from the source string, the one that led to the fulfillment of the condition. So, easy.
To get the start position of the sliding window, we move forward from the current start position in the source string (initially the beginning of the string). For each character we pass, we increment its count, and if the count becomes greater than 0, we increment the match count again. Once the match count is greater than 0, we have found the start position. Counters now: 1,-2,0 1
The start position will be incremented for the next check. We will never start again at 0, but only at the last used start position.
OK, having found a start and end position, we have our first window and will look for potential smaller windows, so we continue to read the source string and check.
After the calculation of the sliding window position, the counters will be: 1,-2,0 1
Reading the next ‘A’. Counters: 0,-2,0 0. Again, the condition is fulfilled.
We continue with sliding window detection. The last start position was pointing to the character ‘x’ after the first ‘A’
Increment start position and skip all ‘x’es. Continue
Reading the first ‘B’. Counters: 0,-1,0 0
Reading the 2nd ‘B’. Counters: 0,0,0 0
Reading the 3rd ‘B’. Counters: 0,1,0 1. Window position calculation done. Start position is the 3rd B. This window is smaller than the previous one, so take it.
Since the source string is consumed, we are done and found the solution.
How do we implement that? We will do a small abstraction of the counter and pack it into a mini class. That will encapsulate the inner handling of character and match counts and can be optimized later.
A counter that works for all kinds of char types could be implemented like the one below:
struct SpecialCounterForGeneralChar {
    std::unordered_map<char, int> individualLetter{};
    int necessaryMatches{};

    SpecialCounterForGeneralChar(const std::string& searchLetters) {
        for (const char c : searchLetters) individualLetter[c]++;
        necessaryMatches = individualLetter.size();
    }
    inline void incrementFor(const char c) {
        individualLetter[c]++;
        if (individualLetter[c] > 0)
            ++necessaryMatches;
    }
    inline void decrementFor(const char c) {
        individualLetter[c]--;
        if (individualLetter[c] == 0)
            --necessaryMatches;
    }
    inline bool allLettersMatched() { return necessaryMatches == 0; }
};
If we know more about the input data and it is for example restricted to an 8 bit char, we can also use:
struct SpecialCounter {
    // int rather than char, so that long inputs cannot overflow a count
    int individualLetter[256]{};
    int necessaryMatches{};

    SpecialCounter(const std::string& searchLetters) {
        for (const unsigned char c : searchLetters) {
            if (individualLetter[c] == 0) ++necessaryMatches;
            individualLetter[c]++;
        }
    }
    // The parameter is an unsigned char, so that it is always a valid index (0..255)
    inline void incrementFor(const unsigned char c) {
        individualLetter[c]++;
        if (individualLetter[c] > 0)
            ++necessaryMatches;
    }
    inline void decrementFor(const unsigned char c) {
        individualLetter[c]--;
        if (individualLetter[c] == 0)
            --necessaryMatches;
    }
    inline bool allLettersMatched() { return necessaryMatches == 0; }
};
This will be slightly faster than the above (under the given restrictions).
The rest of the program will then be just 15 lines of code.
The important message here is that we need to think for a very long time before we start to implement the first line of code.
A well-chosen algorithm and design will help us to find an optimal solution.
Please see the complete example solution below:
#include <string>
#include <iostream>
#include <limits>

using Index = unsigned int;

// We want to hide the implementation of the special counter from the outside world
struct SpecialCounter {
    // int rather than char, so that long inputs cannot overflow a count
    int individualLetter[256]{};
    int necessaryMatches{};

    SpecialCounter(const std::string& searchLetters) {
        for (const unsigned char c : searchLetters) {
            if (individualLetter[c] == 0) ++necessaryMatches;
            individualLetter[c]++;
        }
    }
    // The parameter is an unsigned char, so that it is always a valid index (0..255)
    inline void incrementFor(const unsigned char c) {
        individualLetter[c]++;
        if (individualLetter[c] > 0)
            ++necessaryMatches;
    }
    inline void decrementFor(const unsigned char c) {
        individualLetter[c]--;
        if (individualLetter[c] == 0)
            --necessaryMatches;
    }
    inline bool allLettersMatched() { return necessaryMatches == 0; }
};

std::string solution(const std::string& toBeSearchedIn, const std::string& toBeSearchedFor) {

    // Counter with some special properties
    SpecialCounter counter(toBeSearchedFor);

    // This window will slide. End of window is always the last read character. Start of window may increase
    Index currentWindowStart{};

    // The potential solution. The maximum value of Index marks "no window found yet"
    constexpr Index noWindow{ std::numeric_limits<Index>::max() };
    Index resultingWindowStart{};
    Index resultingWindowWidth{ noWindow };

    // Iterate over all characters of the string under evaluation
    for (Index index{}; index < toBeSearchedIn.length(); ++index) {

        // We saw a character. So, subtract from characters to be searched
        counter.decrementFor(toBeSearchedIn[index]);

        // As long as all necessary characters are found, try to move the start of the sliding window
        while (counter.allLettersMatched()) {

            // Calculate start and width of sliding window. So, if we found a new, narrower window
            const Index currentWindowWidth{ index - currentWindowStart + 1 };
            if (currentWindowWidth < resultingWindowWidth) {

                // Remember one potential solution
                resultingWindowWidth = currentWindowWidth;
                resultingWindowStart = currentWindowStart;
            }
            // Now, for the sliding window. We saw and decremented this character before
            // Now we see it in the sliding window and increment it again.
            counter.incrementFor(toBeSearchedIn[currentWindowStart]);

            // Slide start of window one to the right
            currentWindowStart++;
        }
    }
    return (resultingWindowWidth != noWindow) ? toBeSearchedIn.substr(resultingWindowStart, resultingWindowWidth) : "No solution";
}

int main()
{
    const std::string toBeSearchedIn{ "KKKADOBECODEBBBAANCKKK" };
    const std::string toBeSearchedFor = { "AABBC" };
    std::cout << "Solution:\n" << solution(toBeSearchedIn, toBeSearchedFor) << '\n';
}
Since the question is part of an attempt at an exercise, this answer will not present a complete solution to the exercise problem that inspired it. Instead, it will do just what is asked: it will point out the main issue with the posted code, and how it can be discovered.
Code Examination
An artful approach is to check for mismatches between the requirements, design, and implementation; artful because this approach is more an art than a science, and you can easily lead yourself astray. This basically involves running through design and through implementation in your head, as if you were the processor, though perhaps examining only small parts of the code at a time.
Some of the implementation looks fine, such as: end advancing along in the outer loop, checking for a smaller window (and replacing the previous smallest window). Some could stand closer examination, such as removing entries from the window histogram after checking that the window is valid (for algorithm correctness, it's very useful to think of good loop invariants, such as 'the window should always be valid', and ensure they always hold true).
However, when you look at check_map, there's a mismatch. One problem requirement is:
every character in t (including duplicates) is included in the window
While there is a slight ambiguity in the phrasing (if a character from t occurs in a window more than in t, is the window valid?), the straight reading of this requirement is that the count of a character in s must be at least the count of a character in t. In check_map, the counts are being compared exactly. This strongly suggests a place to examine more closely.
Testing
A semi-automated, systematic approach that can catch all sorts of bugs is using tests, both unit and integration (a search of this site and the web at large will explain these terms). One key part of tests is identifying edge cases to test. For example, if you try with the search string "ACBA" and pattern "AB", the example program correctly finds the minimum window "BA". However, for the search string "ACBBA", it returns "ACB" as the minimum window. This suggests the implementation has an issue with character counts, which makes check_map the prime suspect (and the lines that update strr the secondary suspect).
For another test, consider search string "A123B12345A12BA123A" and pattern "AAB". This has 3 potential windows, with the shortest in the middle. If you fix check_map and test your code against this test case, the code returns "A12BA123A", rather than "A12BA". This suggests something is either wrong with testing the window validity (check_map again) or with setting the answer. Some scaffolding code (e.g. printing start, end and ans when it's updated) will reveal the cause.
Debugging
The most general approach that can reveal an issue with implementation correctness is to use an interactive debugger. For the sample code, breakpoints can be set at various key points, such as beginning of loops and branches. You can furthermore make these breakpoints conditional, at the indices when the code should be finding new windows. If you do this, you'll find that check_map returns false in instances when you'd expect it to be returning true. From there, you can start stepping in to check_map to observe why it's doing this.
Once that's fixed, there is still an issue with the code, though you'll need a test case such as the one with "A123B12345A12BA123A" above, as the issue isn't apparent with the "ADOBECODEBANC" test case. Stepping through the inner loop and examining the various variables will reveal what's going wrong.
Check the API
Bugs basically all have one cause: you expect the code to do one thing, but it does something different. One source of this is misunderstanding an API, so it can be helpful to read the API documentation to make sure your understanding is correct. Typically, before going to the API you'll want to find the specific API calls that aren't behaving as you understand them, which debugging can reveal. I mention this because there is an API call in the sample code that is incorrect.
Conclusion
Each of the above approaches leads to the same bug: the comparison in check_map. Two of them also can lead to an additional bug, given a suitable test case.
Additional Notes
Efficiency
Substring examines & tracks not only those characters in t, but all characters. This leads to the inner loop body being executed (including updating ans) for every character in s, not only those that are present in the pattern. Generally, you should make an implementation correct, then make it efficient. However, in this case it's trivial to make Substring ignore characters that aren't in the pattern and is closer to the problem description.
Types
An earlier formulation of this answer, addressing an earlier formulation of the question, covered examining types to check that they're the most appropriate. For the updated question, this no longer leads to bug discovery.
One point from the early formulation still applies to designing a solution.
Conceptually, the most appropriate data type for the pattern characters and the characters in the current window would be a multiset. As the window shifts, characters can be added and removed simply from a multiset. The validity of the current window is a simple subset operation (pattern ⊆ window). However, multiset in the STL doesn't correspond to the mathematical multiset.
I found recursive code in the Competitive Programmer's Handbook for generating permutations, but I'm struggling to understand the logic behind it.
It states that:
Like subsets, permutations can be generated using recursion. The following function search goes through the permutations of the set {0,1,...,n-1}. The function builds a vector permutation that contains the permutation, and the search begins when the function is called without parameters.
void search() {
    if (permutation.size() == n) {
        // process permutation
    } else {
        for (int i = 0; i < n; i++) {
            if (chosen[i]) continue;
            chosen[i] = true;
            permutation.push_back(i);
            search();
            chosen[i] = false;
            permutation.pop_back();
        }
    }
}
Each function call adds a new element to permutation. The array chosen
indicates which elements are already included in the permutation. If the size of
permutation equals the size of the set, a permutation has been generated.
I can't seem to grasp the intuition and the concept used.
Can someone explain to me what the code is doing and what the logic behind it is?
I will try to give you some intuition. The main idea is to backtrack. You basically build a solution until you face a dead end. When you do face a dead end, go back to the last position where you can do something different from what you did last time. Let me walk through this simulation I have drawn for n = 3.
First you have nothing. Take 1, then 2 and then 3. You have nowhere to go now, i.e. a dead end. You print your current permutation, which is 123. What do you do now? Go back to 1, because you know you can make another path by taking 3 this time. So what do you get this time, in the same way? 132. Can you do anything more using 1? Nope. Now go back to having nothing and start the same thing over, now taking 2. You get the point now, right?
Here is where the same thing happens in the code:
void search() {
    if (permutation.size() == n) // DEAD END
    {
        // process permutation
    }
    else {
        for (int i = 0; i < n; i++) {
            if (chosen[i]) continue; // you have already taken this in your current path, so ignore it now
            chosen[i] = true; // take it, as you haven't already
            permutation.push_back(i);
            search(); // go to the next step after taking this item
            chosen[i] = false; // you have done all you could do with this, now get rid of it
            permutation.pop_back();
        }
    }
}
You can split up the code like this:
void search() {
    if (permutation.size() == n) {
        // we have a valid permutation here
        // process permutation
    } else {
        // The permutation is 'under construction'.
        // The first k elements are fixed,
        // n - k are still missing.
        // So we must choose the next number:
        for (int i = 0; i < n; i++) {
            // some of them are already chosen earlier
            if (chosen[i]) continue;
            // if i is still free:
            // signal to nested calls that i is occupied
            chosen[i] = true;
            // add it to the permutation
            permutation.push_back(i);
            // At the start of this nested call,
            // the permutation will have the first (k + 1)
            // numbers fixed.
            search();
            // Now we UNDO what we did before the recursive call
            // and the permutation state becomes the same as when
            // we entered this call.
            // This allows us to proceed to the next iteration
            // of the for loop.
            chosen[i] = false;
            permutation.pop_back();
        }
    }
}
The intuition could be that search() is "complete the current partially constructed permutation in every way possible and process all of them".
If it is already complete, we only need to process the one possible permutation.
If not, we can first choose the first number in every way possible, and, for each of those, complete the permutation recursively.
I'm positioning sprites in a sliding puzzle game, but I have trouble randomising the tile positions.
How can I check whether a random move (arc4random) has already been made, and ignore previous moves in the randomisation process?
The tiles do randomise/reshuffle, but sometimes they repeat a random move that was already made,
e.g. tile 23 slides to tile 24's position and back several times, each counting as a random move
(which means the board doesn't shuffle properly).
int count = 0;
int moveArray[5];
int GameTile;
int EmptySq;
//loop through the board and find the empty square
for (GameTile = 0; GameTile <25; ++GameTile) {
    if (boardOcc[GameTile]== kEMPTY) {
        EmptySq = GameTile;
        break;
    }
}
int RowEmpty = RowNumber[GameTile];
int colEmpty = ColHeight[GameTile];
if (RowEmpty <4) moveArray[count++] = (GameTile +5);//works out the current possible move
if (RowEmpty >0) moveArray[count++] = (GameTile -5);//to avoid unsolvable puzzles
if (colEmpty <4) moveArray[count++] = (GameTile +1);
if (colEmpty >0) moveArray[count++] = (GameTile -1);
int RandomIndex = arc4random()%count;
int RandomFrom = moveArray[RandomIndex];
boardOcc[EmptySq] = boardOcc[RandomFrom];
boardOcc[RandomFrom] = kEMPTY;
There are a few, if not many, possibilities.
One possibility would be to create a queue-like buffer array that holds, let's say, the last 10 steps. (Queue: values go in at one end and come out at the other.)
For example:
NSMutableArray *_buffer = [NSMutableArray new];
So: the game starts and the buffer array is empty. You generate the first random move and also insert it into the buffer array:
[_buffer insertObject:[NSNumber numberWithInt:RandomIndex] atIndex:0];
Then run a check whether the array contains more than 10 elements, and remove the last one if so:
if([_buffer count] > 10)
{
    [_buffer removeObjectAtIndex:10];
}
We need to remove only one item, as we only add one object each time.
And then we add the check, so that the next 'RandomIndex' is different from the previous 10 indexes. We set 'RandomIndex' to a neutral value (-1) and then run a while loop that sets 'RandomIndex' to a random value and then checks whether '_buffer' contains that value. If it does, the loop regenerates 'RandomIndex' and checks again. In principle it could do so indefinitely, but if 'count' is a much bigger number than the buffer size, it will take 2-3 iterations, tops. No worries.
int RandomIndex = -1;
while(RandomIndex == -1 || [_buffer containsObject:[NSNumber numberWithInt:RandomIndex]])
{
    RandomIndex = arc4random()%count;
}
But you could add some safety to allow it to break out of the loop after, say, 5 cycles (but then it will keep the repeated value):
int RandomIndex = -1;
int safetyCounter = 0;
while(RandomIndex == -1 || [_buffer containsObject:[NSNumber numberWithInt:RandomIndex]])
{
    RandomIndex = arc4random()%count;
    if(safetyCounter == 5)
    {
        break;
    }
    safetyCounter++;
}
You could also decrease the buffer size to 3 or 5; then it will work perfectly in 99.9999999% of cases, or even 100%. That is enough to rule out the case you described, where it randomly picks the same number every second time. Anyway, no worries.
But still, let's discuss another, slightly more advanced and safer way.
The other possibility would be to create two separate buffers. One, as in the previous example, would be used to store the last 10 values, and the second would hold all the other possible unique moves.
So:
NSMutableArray *_buffer = [NSMutableArray new];
NSMutableArray *_allValues = [NSMutableArray new];
At the beginning '_buffer' is empty, but for '_allValues', we add all possible moves:
for(int i = 0; i < count; i++)
{
[_allValues addObject:[NSNumber numberWithInt:i]];
}
And again, when we calculate a random value, we add it to '_buffer' AND remove it from '_allValues':
[_buffer insertObject:[NSNumber numberWithInt:RandomIndex] atIndex:0];
[_allValues removeObject:[NSNumber numberWithInt:RandomIndex]];
After that, we again check whether _buffer is larger than 10. If yes, we remove the last object and add it back to _allValues:
if([_buffer count] > 10)
{
    [_allValues addObject:[_buffer objectAtIndex:10]];
    [_buffer removeObjectAtIndex:10];
}
And most importantly, we calculate 'RandomIndex' from the count of _allValues and take the corresponding object's intValue:
RandomIndex = [[_allValues objectAtIndex:(arc4random()%[_allValues count])] intValue];
Thus we don't need any safety check, because this way each calculated value will be unique among the last 10 moves.
Hope it helps.. happy coding!
I have a global unique path table which can be thought of as a directed un-weighted graph. Each node represents either a piece of physical hardware which is being controlled, or a unique location in the system. The table contains the following for each node:
A unique path ID (int)
Type of component (char - 'A' or 'L')
String which contains a comma separated list of path ID's which that node is connected to (char[])
I need to create a function which, given a starting and an ending node, finds the shortest path between the two nodes. Normally this is a pretty simple problem, but here is the issue I am having. I have a very limited amount of memory/resources, so I cannot use any dynamic memory allocation (i.e. a queue/linked list). It would also be nice if it wasn't recursive (but it wouldn't be too big of an issue if it was, as the table/graph itself is really small. Currently it has 26 nodes, 8 of which will never be hit. At worst case there would be about 40 nodes total).
I started putting something together, but it doesn't always find the shortest path. The pseudo code is below:
bool shortestPath(int start, int end)
    if start == end
        if pathTable[start].nodeType == 'A'
            Turn on part
        end if
        return true
    else
        mark the current node
        bool val
        for each node in connectedNodes
            if node is not marked
                val = shortestPath(node.PathID, end)
            end if
        end for
        if val == true
            if pathTable[start].nodeType == 'A'
                turn on part
            end if
            return true
        end if
    end if
    return false
end function
Anyone have any ideas how to either fix this code, or know something else that I could use to make it work?
----------------- EDIT -----------------
Taking Aasmund's advice, I looked into implementing a Breadth First Search. Below I have some C# code which I quickly threw together using some pseudo code I found online.
pseudo code found online:
Input: A graph G and a root v of G
procedure BFS(G,v):
    create a queue Q
    enqueue v onto Q
    mark v
    while Q is not empty:
        t ← Q.dequeue()
        if t is what we are looking for:
            return t
        for all edges e in G.adjacentEdges(t) do
            u ← G.adjacentVertex(t,e)
            if u is not marked:
                mark u
                enqueue u onto Q
    return none
C# code which I wrote using this code:
public static bool newCheckPath(int source, int dest)
{
    Queue<PathRecord> Q = new Queue<PathRecord>();
    Q.Enqueue(pathTable[source]);
    pathTable[source].markVisited();
    while (Q.Count != 0)
    {
        PathRecord t = Q.Dequeue();
        if (t.pathID == pathTable[dest].pathID)
        {
            return true;
        }
        else
        {
            string connectedPaths = pathTable[t.pathID].connectedPathID;
            for (int x = 0; x < connectedPaths.Length && connectedPaths != "00"; x = x + 3)
            {
                int nextNode = Convert.ToInt32(connectedPaths.Substring(x, 2));
                PathRecord u = pathTable[nextNode];
                if (!u.wasVisited())
                {
                    u.markVisited();
                    Q.Enqueue(u);
                }
            }
        }
    }
    return false;
}
This code runs just fine; however, it only tells me whether a path exists. That doesn't really work for me. Ideally, in the block "if (t.pathID == pathTable[dest].pathID)" I would like to have either a list or some other way to see which nodes I had to pass through to get from the source to the destination, so that I can process those nodes there, rather than returning a list to process elsewhere. Any ideas on how I could make that change?
The most effective solution, if you're willing to use static memory allocation (or automatic, as I seem to recall that the C++ term is), is to declare a fixed-size int array (of size 41, if you're absolutely certain that the number of nodes will never exceed 40). By using two indices to indicate the start and end of the queue, you can use this array as a ring buffer, which can act as the queue in a breadth-first search.
Alternatively: Since the number of nodes is so small, Bellman-Ford may be fast enough. The algorithm is simple to implement, does not use recursion, and the required extra memory is only a distance (int, or even byte in your case) and a predecessor id (int) per node. The running time is O(VE), alternatively O(V^3), where V is the number of nodes and E is the number of edges.