Avoiding repeats in numbers for a 3x3 grid in C++

I'm trying to make a solver that checks a 3x3 block to make sure that no number repeats. Unfortunately, I cannot get the logic right, and I'm not sure what I am doing incorrectly. Here's my code:
bool sudoku :: check_if_non_repeat(int r, int c, int v) //where r=row, c=column, v=value
Any idea why this is not working? I'm just getting infinite loops.

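if (!(j = brow && k == bcol))
Check this line: j = brow is an assignment and should be the comparison j == brow. Because = binds more loosely than &&, j is overwritten with 0 or 1 every time the line runs; if j is your loop counter, that would explain the infinite loop.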

I'm not sure what you tried to do, but I would do it like this:
bool sudoku :: check_if_non_repeat(int r, int c, int v) //where r=row, c=column, v=value
{
int brow = r/3;
int bcol = c/3;
for (int j = brow * 3; j < (brow * 3 + 3); j++)
for (int k = bcol * 3; k < (bcol * 3 + 3); k++)
if (sudoku_array[j][k] == v)
return true;
return false;
}
EDIT:
As noted below, the if statement needs to be more complicated, so that empty cells (value 0) and the cell being checked are ignored:
if ( sudoku_array[j][k] == v
&& v != 0
&& !(j == r && k == c))
return true;
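The check from the edit can be put together as a free function. This is only a minimal sketch, assuming the board is a 9x9 array in which 0 marks an empty cell; Grid and repeats_in_box are names made up for this example and are not part of the original sudoku class.

```cpp
#include <array>

// Assumption: the board is 9x9 and 0 marks an empty cell.
using Grid = std::array<std::array<int, 9>, 9>;

// Returns true if value v already appears in the 3x3 box containing (r, c),
// ignoring the cell (r, c) itself and treating v == 0 as "nothing to check".
bool repeats_in_box(const Grid& grid, int r, int c, int v)
{
    if (v == 0) return false;
    const int box_row = (r / 3) * 3;  // top-left corner of the box
    const int box_col = (c / 3) * 3;
    for (int j = box_row; j < box_row + 3; ++j)
        for (int k = box_col; k < box_col + 3; ++k)
            if (grid[j][k] == v && !(j == r && k == c))
                return true;
    return false;
}
```

With a board that holds a 7 at (4, 4), repeats_in_box(grid, 3, 5, 7) is true, while the same call for (4, 4) itself is false because that cell is skipped.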

I'm about to tell you about a different approach to the problem. I made a full solver a long while ago, and it basically used the opposite approach.
For each field, I had a std::bitset<9> which told me which values were still possible in that field. Each insertion would then update the other fields in the same row, column and box to remove that possibility, recursively filling out subsequent fields when any one of them had only one option left.
If it then tried to fill in a number which was no longer allowed, then the last input given was not a valid number for that spot. That was also a far more thorough check than you're doing here: you won't be checking whether you close off the last possibility for another field in the same row/column/box, let alone others.
I never implemented a couple of planned optimizations, but even without them it outperformed (too quick to notice) my friend's solver (>50 seconds), mostly because he had code like yours.
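The bookkeeping described above might look roughly like this. It is a sketch under assumptions, not the solver from this answer: the Candidates type and its place and strike functions are hypothetical names, and the recursive filling of fields that are left with a single option is deliberately omitted.

```cpp
#include <array>
#include <bitset>

// Hypothetical helper type: bit (v - 1) set means value v is still possible.
struct Candidates {
    std::array<std::array<std::bitset<9>, 9>, 9> cell;

    Candidates() {
        for (auto& row : cell)
            for (auto& c : row)
                c.set();  // every value 1..9 starts out possible
    }

    // Place value v (1..9) at (r, c) and strike it from every peer.
    // Returns false if some peer is left with no possible value.
    bool place(int r, int c, int v) {
        std::bitset<9> only;
        only.set(v - 1);
        cell[r][c] = only;
        const int br = (r / 3) * 3, bc = (c / 3) * 3;
        for (int i = 0; i < 9; ++i) {
            if (!strike(r, i, r, c, v)) return false;                    // same row
            if (!strike(i, c, r, c, v)) return false;                    // same column
            if (!strike(br + i / 3, bc + i % 3, r, c, v)) return false;  // same box
        }
        return true;
    }

private:
    // Removes v from the peer (pr, pc); reports whether the peer still has options.
    bool strike(int pr, int pc, int r, int c, int v) {
        if (pr == r && pc == c) return true;  // skip the cell just filled
        cell[pr][pc].reset(v - 1);
        return cell[pr][pc].any();
    }
};
```

After place(0, 0, 5) on a fresh Candidates object, the value 5 is no longer possible anywhere in row 0, column 0 or the top-left box, while an unrelated field such as (4, 4) still has all nine options.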


Creating the Backtracking Algorithm for n-queen Problem [closed]

I have tried to come up with a solution to the n-queen problem through backtracking. I have created a board, and I think I have created functions which check whether a piece can be placed at position column2, given a piece at position column1. I guess I want to loop through the columns to check whether the current piece is in a forbidden position relative to any of the pieces already placed in the first row through the current row minus one. I haven't done this yet, and I'm confused at the moment, so I can't really see how I should do it.
Let me share the code I have written so far:
// Method for creating chessboard
vector<vector<vector<int>>> create_chessboard(int size_of_board)
{
vector<int> v1;
vector<vector<int>> v2;
vector<vector<vector<int>>> v3;
for (int i = 0; i < size_of_board; i++)
{
for (int j = 0; j < size_of_board; j++)
{
v1.clear();
v1.push_back(i);
v1.push_back(j);
v2.push_back(v1);
}
v3.push_back(v2);
v2.clear();
}
return v3;
}
// Method for visualizing chessboard
void visualize_board(vector<vector<vector<int>>> chess, int dimension_of_board)
{
int i = 1;
for (vector<vector<int>> rows : chess)
{
for (int j = 0; j < dimension_of_board; j++)
{
cout << "(" << rows[j][0] << "," << rows[j][1] << ")" << " ";
}
cout << endl;
}
}
// Method for checking if two coordinates are on the same diagonal
bool check_diagonal(vector<int> coordinate1, vector<int> coordinate2)
{
if(abs(coordinate1[1] - coordinate2[1]) == abs(coordinate1[0] - coordinate2[0]))
{
return true;
}
return false;
}
bool check_column(vector<int> coordinate1, vector<int> coordinate2)
{
if(coordinate1[1] == coordinate2[1])
{
return true;
}
return false;
}
bool check_row(vector<int> coordinate1, vector<int> coordinate2)
{
if (coordinate1[0] == coordinate2[0])
{
return true;
}
return false;
}
bool check_allowed_positions(vector<int> coordinate1, vector<int> coordinate2, int column)
{
if (check_diagonal(coordinate1, coordinate2))
{
return false;
}
if (check_column(coordinate1, coordinate2))
{
return false;
}
if (check_row(coordinate1, coordinate2))
{
return false;
}
return true;
}
vector<vector<int>> solve_nqueen(vector<vector<vector<int>>> board, int dimension_of_board, int row)
{
vector<int> first_element = board[0][0];
vector<vector<int>> solution_space;
if (dimension_of_board == row)
{
cout << "we found a solution!";
}
/*
if (dimension_of_board == row)
{
}
for (int j = 0; j < dimension_of_board; j++)
{
if (check_allowed_positions(board, row, j))
{
do something here
solve_nqueen(board, dimension_of_board, row+1);
}
else
{
do something here;
}
}
return;
*/
return solution_space;
}
I would be really happy if someone could lay out a few steps I have to take in order to build the solve_nqueen function, and maybe add some remarks on how I could do that. If I should add any further information, just let me know! I'm happy to elaborate.
I hope this isn't a stupid question. I have been searching the internet for a solution, but I didn't manage to use what I found.
Best wishes,
Joel
There is not always a solution: for example, there is none for 2 queens on a 2x2 board or for 3 queens on a 3x3 board.
This is a well-known problem (which can also be found on the internet). According to this, there is no simple rule or structure for finding a solution. You could reduce the problem by symmetries, but that is not simple either.
So you have to loop through all candidate placements (n squares out of n x n) and do all tests for every queen. (You can halve the work by checking each pair of queens only once, but that does not gain much, and such a reduction takes some time too.)
Note: Your check routines are correct.
For 8 queens on an 8x8 board, write 8 nested loops, each running i(x) from 0 to 63 (row is i(x)%8 and column is i(x)/8). You also need to check that a queen does not sit on another queen, but your check routines will already catch that. Within the second nested loop you can already check whether the first two queens are compatible; if they are not, you do not have to go any deeper and can move the second queen to a new position right away.
I also propose not writing the search for general n at first, but for a fixed n=8 or n=7. (That is easier for a beginning.)
Speed-ups:
While going deeper into the nested loops, you might keep a quick record (an array) of positions which already did not work for the outer loops (still 64 records to check, but this could be written to be faster than running your check routines again).
Or even better, run the inner loops only over a list of remaining candidates, which is much shorter than the (n x n) positions.
There should be some more options for speed-ups, which you might find.
Final proposal: do not only wait for the full result, but also report when you find, e.g., a valid position of 5 queens, then of 6 queens, and so on. That is more fun than waiting ages with nothing happening.
A further idea is to visit the squares for each queen not in order from 0 to 63 but in random order, which might lead to more surprises. For this, shuffle an array 0 .. 63 into a random order. Then still loop from 0 to 63, but use the loop variable only as an index into the shuffled array. It would be even more interesting to create 8 shuffled arrays, one for each queen. If you run the program then, anything could happen: the first few trials could (theoretically) already deliver a successful result.
If you would like to become super efficient, note that the queen state on the 8x8 board can be stored in one 64-bit integer variable (64 bits, where '1' means a queen is on that square; keyword: bitboards). I didn't mention this in the beginning because the approach you started with is quite different.
From there, you could create 64 bit masks, one per queen position, each marking every square that a queen on that position can reach. Then you only need one bitwise AND of two (properly defined) 64-bit variables, like a & b, which replaces your diagonal, column and row check routines with a single operation and thus is much faster.
Avoid too many function calls, or use inline.
... and an endless list of further possible speed-ups: compiler options, parallelization, better algorithms, avoiding cache misses (work on as little memory as possible, or access memory in a regular pattern), ... as usual ...
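The bitboard idea from the list above could be sketched as follows. This is an illustration under assumptions rather than a tuned implementation: build_attack_masks and is_attacked are names invented here, and squares are numbered row * 8 + column (the opposite mapping works just as well, because the board is symmetric).

```cpp
#include <array>
#include <cstdint>
#include <cstdlib>

// attack[s] has bit t set when a queen on square s attacks square t
// (same row, same column, or same diagonal); square = row * 8 + column.
std::array<std::uint64_t, 64> build_attack_masks()
{
    std::array<std::uint64_t, 64> attack{};
    for (int s = 0; s < 64; ++s)
        for (int t = 0; t < 64; ++t) {
            if (s == t) continue;
            const int dr = std::abs(s / 8 - t / 8);  // row distance
            const int dc = std::abs(s % 8 - t % 8);  // column distance
            if (dr == 0 || dc == 0 || dr == dc)
                attack[s] |= std::uint64_t{1} << t;
        }
    return attack;
}

// True if a queen on `square` would be attacked by any queen in `occupied`,
// where `occupied` has one bit set per queen already on the board.
bool is_attacked(const std::array<std::uint64_t, 64>& attack,
                 std::uint64_t occupied, int square)
{
    return (attack[square] & occupied) != 0;
}
```

The masks are built once up front; after that, each compatibility test is a single AND against the bitboard of queens placed so far, instead of one row, column and diagonal comparison per placed queen.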
My best answer, e.g. for 8-queen problem:
queen 1 is between 0 .. 7
queen 2 is between 8 .. 15
queen 3 is between 16 .. 23
queen 4 is between 24 .. 31
queen 5 is between 32 .. 39
queen 6 is between 40 .. 47
queen 7 is between 48 .. 55
queen 8 is between 56 .. 63
because all 8 queens have to be on different rows!
These are the limits of the nested loops then, which gives "only"
8 * 8 * 8 * 8 * 8 * 8 * 8 * 8 = 16777216
possibilities to be checked. This can be quick on modern machines.
Then probably you don't need anything more sophisticated (to which my first answer refers - for the 8x8 queens problem.) Anyway, you could still also keep a record of which column is still free, while diving into the nested loops, which yields a further dramatic cut down of checks.
I wrote some C code (similar to C++) to verify my answer. In fact, it is very fast, much less than a second (real 0m0,004s; user 0m0,003s; sys 0m0,001s). The code finds the correct number of 92 solutions for the 8x8 queens problem.
#include <stdio.h>
int f(int a, int b)
{
int r1, c1, r2, c2, d1, d2;
int flag = 1;
r1 = a / 8;
r2 = b / 8;
c1 = a % 8;
c2 = b % 8;
d1 = r1 - r2;
d2 = c1 - c2;
if( d1 == d2 || d1 == -d2 || c1 == c2 ) flag=0;
return flag;
}
int main()
{
int p0,p1, p2, p3, p4, p5, p6, p7;
int solutions=0;
for(p0=0; p0<8; p0++)
{
for(p1=8; p1<16; p1++)
{
if( f(p0,p1) )
for(p2=16; p2<24; p2++)
{
if( f(p0,p2) && f(p1,p2) )
for(p3=24; p3<32; p3++)
{
if( f(p0,p3) && f(p1,p3) && f(p2,p3) )
for(p4=32; p4<40; p4++)
{
if( f(p0,p4) && f(p1,p4) && f(p2,p4) && f(p3,p4))
for(p5=40; p5<48; p5++)
{
if( f(p0,p5) && f(p1,p5) && f(p2,p5) && f(p3,p5) && f(p4,p5) )
for(p6=48; p6<56; p6++)
{
if( f(p0,p6) && f(p1,p6) && f(p2,p6) && f(p3,p6) && f(p4,p6) && f(p5,p6))
for(p7=56; p7<64; p7++)
{
if( f(p0,p7) && f(p1,p7) && f(p2,p7) && f(p3,p7) && f(p4,p7) && f(p5,p7) && f(p6,p7))
{
solutions++;
// 0 .. 63 integer print
printf("%2i %2i %2i %2i %2i %2i %2i %2i\n",
p0,p1,p2,p3,p4,p5,p6,p7);
// a1 .. h8 chess notation print
//printf("%c%d %c%d %c%d %c%d %c%d %c%d %c%d %c%d\n",
//p0%8+'a', p0/8+1, p1%8+'a', p1/8+1, p2%8+'a', p2/8+1, p3%8+'a', p3/8+1,
//p4%8+'a', p4/8+1, p5%8+'a', p5/8+1, p6%8+'a', p6/8+1, p7%8+'a', p7/8+1);
}
}
}
}
}
}
}
}
}
printf("%i solutions have been found\n",solutions);
return 0; // 0 reports success to the caller
}
Notes: Subroutine f checks if two queen positions are "ok" with each other (1 means true, 0 means false, in C). An inner loop is only entered, if all already selected positions (in outer loops) are "ok" with each other.
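The eight nested loops can also be written as one recursive function, which is the usual backtracking form and works for any board size. The following is a sketch, not the code from this answer: count_nqueens and place_row are hypothetical names, and it keeps a record of used columns and diagonals instead of calling a pairwise check like f.

```cpp
#include <vector>

// Hypothetical helper: tries every column in `row` and recurses only when the
// column and both diagonals through that square are still free.
static int place_row(int row, int n, std::vector<bool>& col_used,
                     std::vector<bool>& diag_used, std::vector<bool>& anti_used)
{
    if (row == n) return 1;  // all n queens placed: one more solution
    int found = 0;
    for (int col = 0; col < n; ++col) {
        const int d = row - col + n - 1;  // index of the "\" diagonal
        const int a = row + col;          // index of the "/" diagonal
        if (col_used[col] || diag_used[d] || anti_used[a]) continue;
        col_used[col] = true;
        diag_used[d] = true;
        anti_used[a] = true;
        found += place_row(row + 1, n, col_used, diag_used, anti_used);
        col_used[col] = false;  // backtrack: free the square again
        diag_used[d] = false;
        anti_used[a] = false;
    }
    return found;
}

// Counts all placements of n non-attacking queens on an n x n board.
int count_nqueens(int n)
{
    if (n <= 0) return 0;
    std::vector<bool> col_used(n, false);
    std::vector<bool> diag_used(2 * n - 1, false);
    std::vector<bool> anti_used(2 * n - 1, false);
    return place_row(0, n, col_used, diag_used, anti_used);
}
```

count_nqueens(8) yields the same 92 solutions as the nested loops, and count_nqueens(2) and count_nqueens(3) yield 0, matching the remark that small boards have no solution.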

Multiply numbers which are divisible by 3 and less than 10 with a while loop in c++?

In C++, I should write a program where the app detects which numbers are divisible by 3 from 1 till 10 and then multiply all of them and print the result. That means that I should multiply 3,6,9 and print only the result, which is 162, but I should do it by using a "While" loop, not just multiplying the 3 numbers with each other. How should I write the code of this? I attached my attempt to code the problem below. Thanks
#include <iostream>
using namespace std;
int main() {
int x, r;
int l;
x = 1;
r = 0;
while (x < 10 && x%3==0) {
r = (3 * x) + 3;
cout << r;
}
cin >> l;
}
First, putting the condition x%3 == 0 into the while condition ends the loop before its first iteration, because x is 1 and 1%3 is not 0. You need to check that condition inside the loop.
Since you wish to store your answer in the variable r, you must initialize it to 1, since multiplying anything by 0 gives 0.
Another important point is that you need to increment x on each iteration, so that every number in the range 1 to 10 is checked for divisibility by 3.
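#include <iostream>
using namespace std;
int main()
{
    int x, r;
    x = 1;
    r = 1;              // start at 1: a product that starts at 0 stays 0
    while (x < 10)
    {
        if (x % 3 == 0) // the divisibility test belongs inside the loop
            r = r * x;
        x = x + 1;      // incrementing the value of x
    }
    cout << r;          // prints 162 (3 * 6 * 9)
}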
Lastly, I have no idea why you have written the last cin>>l statement. Omit it if it is not required.
OK, so here are a few hints that will hopefully help you solve this:
1. Your approach with two variables (x and r) outside the loop is a good starting point for this.
2. Like I wrote in the comments, you should use *= instead of your formula (I still don't understand how it is related to the problem).
3. Don't check whether x is divisible by 3 inside the while condition, because it would end the loop too early.
4. You can delete your l variable because it has no effect at the moment ;)
5. Your output should also happen outside the loop, otherwise it is printed on every iteration of the loop.
I hope I can help ;)
EDIT: Forget about No. 4. I didn't see your comment about the non-closing console.
int main()
{
int result = 1; // "result" is better than "r"
for (int x=1; x < 10; ++x)
{
if (x%3 == 0)
result = result * x;
}
cout << result;
}
or the loop in short with some additional knowledge:
for (int x=3; x < 10; x += 3) // 3 is divisible by 3, so start there and step in threes
result *= x;
or, as it is C++, and for learning purposes, you could do (this needs the headers <vector>, <numeric> and <functional>):
vector<int> values; // a container holding integers that will get the multiples of 3
for (int x=1; x < 10; ++x) // as usual
if ( !(x%3) ) // same as x%3 == 0; the parentheses matter, because !x%3 parses as (!x)%3
values.push_back(x); // put the newly found number in the container
// now use a function that multiplies all numbers of the container (1 is start value)
result = std::accumulate(values.begin(), values.end(), 1, multiplies<int>());
// so much fun, also get the sum (0 is the start value, no function needed as add is standard)
int sum = std::accumulate(values.begin(), values.end(), 0);
It's important to remember the difference between = and ==. = sets something to a value while == compares something to a value. You're on the right track with incrementing x and using x as a condition to check your range of numbers. When writing code I usually try and write a "pseudocode" in English to organize my steps and get my logic down. It's also wise to consider using variables that tell you what they are as opposed to just random letters. Imagine if you were coding a game and you just had letters as variables; it would be impossible to remember what is what. When you are first learning to code this really helps a lot. So with that in mind:
/*
- While x is less than 10
- check value to see if it's divisible by 3 (x mod 3 is zero)
- if it is, multiply it into a running product
- either way, bump the counter x
- After my condition is met
- print to screen, pause screen
*/
Now if we flesh out that pseudocode a little more we'll get a skeletal structure.
#include <iostream>
int main()
{
    int x = 1;        // value we'll use as a counter
    int product = 1;  // running product to print out at the end (starts at 1, not 0)
    while (x < 10)    // condition we'll check against
    {
        if (x % 3 == 0)
        {
            product = product * x;
        }
        x = x + 1;    // increment x whether or not it was divisible
    }
    std::cout << product << '\n';  // screen output of the product
    std::cin.get();   // system pause or cin.get(): use whatever your teacher gave you
}
I've given you a lot to work with here you should be able to figure out what you need from this. Computer Science and programming is hard and will require a lot of work. It's important to develop good coding habits and form now as it will help you in the future. Coding is a skill like welding; the more you do it the better you'll get. I often refer to it as the "Blue Collar Science" because it's really a skillset and not just raw knowledge. It's not like studying history or Biology (minus Biology labs) because those require you to learn things and loosely apply them whereas programming requires you to actually build something. It's like welding or plumbing in my opinion.
Additionally, when you come to sites like these, try to read up on how things should be posted, and try to seek the "logic" behind the answer and come up with it on your own as opposed to asking for the answer. People will be more inclined to help you if they think you're working for something instead of asking for a handout (not saying you are, just some advice). Also take the attitude these guys give you with a grain of salt; computer scientists aren't known to be the world's most personable people. =) Good luck.

Solving "Welcome to Code Jam" from Google Code Jam 2009

I am trying to solve the following Code Jam question. I've made some progress, but for a few cases my code gives wrong output.
Welcome to Code Jam
So I stumbled on a solution by the developer "rem" from Russia.
I have no idea how his/her solution works. The code:
const string target = "welcome to code jam";
char buf[1<<20];
int main() {
freopen("input.txt", "rt", stdin);
freopen("output.txt", "wt", stdout);
gets(buf);
FOR(test, 1, atoi(buf)) {
gets(buf);
string s(buf);
int n = size(s);
int k = size(target);
vector<vector<int> > dp(n+1, vector<int>(k+1));
dp[0][0] = 1;
const int mod = 10000;
assert(k == 19);
REP(i, n) REP(j, k+1) {// Whats happening here
dp[i+1][j] = (dp[i+1][j]+dp[i][j])%mod;
if (j < k && s[i] == target[j])
dp[i+1][j+1] = (dp[i+1][j+1]+dp[i][j])%mod;
}
printf("Case #%d: %04d\n", test, dp[n][k]);
}
exit(0);
}//credit rem
Can somebody explain what's happening in the two loops?
Thanks.
What he is doing is dynamic programming; that much you can probably see too.
He has a 2D array, and you need to understand its semantics.
dp[i][j] counts the number of ways to obtain the first j letters of "welcome to code jam" as a subsequence of the first i letters of the input string. Both indexes are 1-based (they are counts of letters), which leaves index 0 for the case of not taking any letters from a string.
For example if the input is:
welcome to code jjam
The values of dp in different situations are going to be:
dp[1][1] = 1; // first letter is w. perfect just the goal
dp[1][2] = 0; // no way to have two letters in just one-letter string
dp[2][2] = 1; // again: perfect
dp[2][1] = 1; // here we ignore the e. We just need the w.
dp[7][2] = 2; // two ways to construct we: [we]lcome and [w]elcom[e].
The loop you are specifically asking about calculates new dynamic values based on the already calculated ones.
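For reference, the two loops can be written without the contest macros (FOR, REP, gets). This is a sketch of the same recurrence in standard C++; the function name count_subsequences and its default argument are choices made for this example, not part of rem's solution.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Counts, modulo 10000, in how many ways `target` occurs as a subsequence of `text`.
int count_subsequences(const std::string& text,
                       const std::string& target = "welcome to code jam")
{
    const int mod = 10000;
    const std::size_t n = text.size(), k = target.size();
    // dp[i][j]: ways to form the first j letters of target from the first i letters of text.
    std::vector<std::vector<int>> dp(n + 1, std::vector<int>(k + 1, 0));
    dp[0][0] = 1;  // one way to form nothing from nothing
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j <= k; ++j) {
            // Option 1: skip text[i]; every way counted so far carries over.
            dp[i + 1][j] = (dp[i + 1][j] + dp[i][j]) % mod;
            // Option 2: use text[i] as the next letter of target, if it matches.
            if (j < k && text[i] == target[j])
                dp[i + 1][j + 1] = (dp[i + 1][j + 1] + dp[i][j]) % mod;
        }
    return dp[n][k];
}
```

For the example input "welcome to code jjam" this returns 2, one way for each of the two j characters. Each cell pushes its count forward to the next row, which is why the loops only ever read row i and write row i+1.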
Whoa, I was practicing this problem a few days ago and stumbled across this question.
I suspect that saying "he's doing dynamic programming" won't explain much if you have not studied DP.
I can give a clearer implementation and an easier explanation:
string phrase = "welcome to code jam"; // S
string text; getline(cin, text); // T
vector<int> ob(text.size(), 1);
int ans = 0;
for (int p = 0; p < phrase.size(); ++p) {
ans = 0;
for (int i = 0; i < text.size(); ++i) {
if (text[i] == phrase[p]) ans = (ans + ob[i]) % 10000;
ob[i] = ans;
}
}
cout << setfill('0') << setw(4) << ans << endl;
To solve the problem if S had only one character S[0] we could just count number of its occurrences.
If it had only two characters S[0..1] we see that each occurrence T[i]==S[1] increases answer by the number of occurrences of S[0] before index i.
For three characters S[0..2] each occurrence T[i]==S[2] similarly increases answer by number of occurrences of S[0..1] before index i. This number is the same as the answer value at the moment the previous paragraph had processed T[i].
If there were four characters, the answer would be increasing by number of occurrences of the previous three before each index at which fourth character is found, and so on.
As every other step uses values from the previous ones, this can be solved incrementally. On each step p we need to know number of occurrences of previous substring S[0..p-1] before any index i, which can be kept in array of integers ob of the same length as T. Then the answer goes up by ob[i] whenever we encounter S[p] at i. And to prepare ob for the next step, we also update each ob[i] to be the number of occurrences of S[0..p] instead — i.e. to the current answer value.
By the end the latest answer value (and the last element of ob) contain the number of occurrences of whole S in whole T, and that is the final answer.
Notice that it starts with ob filled with ones. The first step is different from the rest; but counting number of occurrences of S[0] means increasing answer by 1 on each occurrence, which is what all other steps do, except that they increase by ob[i]. So when every ob[i] is initially 1, the first step will run just like all others, using the same code.

Wildcard String Search Algorithm

In my program I need to search in a quite big string (~1 mb) for a relatively small substring (< 1 kb).
The problem is that the substring contains simple wildcards such as "a?c", which means I want to search for strings like "abc" or "apc", ... (I am only interested in the first occurrence).
Until now I use the trivial approach (here in pseudocode)
algorithm "search", input: haystack(string), needle(string)
for(i = 0; i + length(needle) <= length(haystack); ++i)
if(!CompareMemory(haystack+i, needle, length(needle)))
return i;
return -1; (Not found)
Where "CompareMemory" returns 0 iff the first and second argument are identical (also concerning wildcards) only regarding the amount of bytes the third argument gives.
My question is whether there is a fast algorithm for this (you don't have to give it, but if you do, I would prefer C++, C or pseudocode). I started here, but I think most of the fast algorithms don't allow wildcards (by the way, they exploit the nature of strings).
I hope the format of the question is ok because I am new here, thank you in advance!
A fast way, which is much the same thing as using a regexp (which I would recommend anyway), is to find a character in needle that is fixed, such as "a" rather than "?", search for it, and then see if you've got a complete match.
j = firstNonWildcardPos(needle)
for(i = j; i <= length(haystack)-length(needle)+j; ++i)
if(haystack[i] == needle[j])
if(!CompareMemory(haystack+i-j, needle, length(needle)))
return i-j; (start of the match)
return -1; (Not found)
A regexp would generate code similar to this (I believe).
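The anchor-character idea might be written in C++ roughly as follows. It is a minimal sketch: find_wildcard is a hypothetical name, '?' is assumed to be the only wildcard and to match exactly one character, and a needle made only of wildcards is treated as matching at position 0.

```cpp
#include <cstddef>
#include <string>

// Returns the index of the first match of `needle` in `haystack`, where `wild`
// in `needle` matches any single character, or std::string::npos if none.
std::size_t find_wildcard(const std::string& haystack, const std::string& needle,
                          char wild = '?')
{
    const std::size_t h = haystack.size(), n = needle.size();
    if (n > h) return std::string::npos;
    // Use the first fixed character as the anchor; npos means all wildcards.
    const std::size_t j = needle.find_first_not_of(wild);
    if (j == std::string::npos) return 0;  // any n characters match
    for (std::size_t start = 0; start + n <= h; ++start) {
        if (haystack[start + j] != needle[j]) continue;  // cheap rejection on the anchor
        std::size_t k = 0;
        while (k < n && (needle[k] == wild || needle[k] == haystack[start + k])) ++k;
        if (k == n) return start;  // every position matched
    }
    return std::string::npos;
}
```

For example, find_wildcard("xxapcxx", "a?c") gives 2. The full comparison only runs at positions where the anchor character already matches, so most positions are rejected after a single character test.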
Among strings over an alphabet of c characters, let S have length s and let T_1 ... T_k have average length b. S will be searched for each of the k target strings. (The problem statement doesn't mention multiple searches of a given string; I mention it below because in that paradigm my program does well.)
The program uses O(s+c) time and space for setup, and (if S and the T_i are random strings) O(k*u*s/c) + O(k*b + k*b*s/c^u) total time for searching, with u=3 in program as shown. For longer targets, u should be increased, and rare, widely-separated key characters chosen.
In step 1, the program creates an array L of s+TsizMax integers (in program, TsizMax = allowed target length) and uses it for c lists of locations of next occurrences of characters, with list heads in H[] and tails in T[]. This is the O(s+c) time and space step.
In step 2, the program repeatedly reads and processes target strings. Step 2A chooses u = 3 different non-wild key characters (in current target). As shown, the program just uses the first three such characters; with a tiny bit more work, it could instead use the rarest characters in the target, to improve performance. Note, it doesn't cope with targets with fewer than three such characters.
The line "L[T[r]] = L[g+i] = g+i;" within Step 2A sets up a guard cell in L with proper delta offset so that Step 2G will automatically execute at end of search, without needing any extra testing during the search. T[r] indexes the tail cell of the list for character r, so cell L[g+i] becomes a new, self-referencing, end-of-list for character r. (This technique allows the loops to run with a minimum of extraneous condition testing.)
Step 2B sets vars a,b,c to head-of-list locations, and sets deltas dab, dac, and dbc corresponding to distances between the chosen key characters in target.
Step 2C checks if key characters appear in S. This step is necessary because otherwise a while loop in Step 2E will hang. We don't want more checks within those while loops because they are the inner loops of search.
Step 2D repeats steps 2E to 2I until var c points past the end of S, at which point it is impossible to make any more matches.
Step 2E consists of u = 3 while loops, that "enforce delta distances", that is, crawl indexes a,b,c along over each other as long as they are not pattern-compatible. The while loops are fairly fast, each being in essence (with ++si instrumentation removed) "while (v+d < w) v = L[v]" for various v, d, w. Replicating the three while loops a few times may increase performance a little and will not change net results.
In Step 2G, we know that the u key characters match, so we do a complete compare of target to match point, with wild-character handling. Step 2H reports result of compare. Program as given also reports non-matches in this section; remove that in production.
Step 2I advances all the key-character indexes, because none of the currently-indexed characters can be the key part of another match.
You can run the program to see a few operation-count statistics. For example, the output
Target 5=<de?ga>
012345678901234567890123456789012345678901
abc1efgabc2efgabcde3gabcdefg4bcdefgabc5efg
# 17, de?ga and de3ga match
# 24, de?ga and defg4 differ
# 31, de?ga and defga match
Advances: 'd' 0+3 'e' 3+3 'g' 3+3 = 6+9 = 15
shows that Step 2G was entered 3 times (ie, the key characters matched 3 times); the full compare succeeded twice; step 2E while loops advanced indexes 6 times; step 2I advanced indexes 9 times; there were 15 advances in all, to search the 42-character string for the de?ga target.
/* jiw
$Id: stringsearch.c,v 1.2 2011/08/19 08:53:44 j-waldby Exp j-waldby $
Re: Concept-code for searching a long string for short targets,
where targets may contain wildcard characters.
The user can enter any number of targets as command line parameters.
This code has 2 long strings available for testing; if the first
character of the first parameter is '1' the jay[42] string is used,
else kay[321].
Eg, for tests with *hay = jay use command like
./stringsearch 1e?g a?cd bc?e?g c?efg de?ga ddee? ddee?f
or with *hay = kay,
./stringsearch bc?e? jih? pa?j ?av??j
to exercise program.
Copyright 2011 James Waldby. Offered without warranty
under GPL v3 terms as at http://www.gnu.org/licenses/gpl.html
*/
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <limits.h>
//================================================
int main(int argc, char *argv[]) {
char jay[]="abc1efgabc2efgabcde3gabcdefg4bcdefgabc5efg";
char kay[]="ludehkhtdiokihtmaihitoia1htkjkkchajajavpajkihtijkhijhipaja"
"etpajamhkajajacpajihiatokajavtoia2pkjpajjhiifakacpajjhiatkpajfojii"
"etkajamhpajajakpajihiatoiakavtoia3pakpajjhiifakacpajjhkatvpajfojii"
"ihiifojjjjhijpjkhtfdoiajadijpkoia4jihtfjavpapakjhiifjpajihiifkjach"
"ihikfkjjjjhijpjkhtfdoiajakijptoik4jihtfjakpapajjkiifjpajkhiifajkch";
char *hay = (argc>1 && argv[1][0]=='1')? jay:kay;
enum { chars=1<<CHAR_BIT, TsizMax=40, Lsiz=TsizMax+sizeof kay, L1, L2 };
int L[L2], H[chars], T[chars], g, k, par;
// Step 1. Make arrays L, H, T.
for (k=0; k<chars; ++k) H[k] = T[k] = L1; // Init H and T
for (g=0; hay[g]; ++g) { // Make linked character lists for hay.
k = hay[g]; // In same loop, could count char freqs.
if (T[k]==L1) H[k] = T[k] = g;
T[k] = L[T[k]] = g;
}
// Step 2. Read and process target strings.
for (par=1; par<argc; ++par) {
int alpha[3], at[3], a=g, b=g, c=g, da, dab, dbc, dac, i, j, r;
char * targ = argv[par];
enum { wild = '?' };
int sa=0, sb=0, sc=0, ta=0, tb=0, tc=0;
printf ("Target %d=<%s>\n", par, targ);
// Step 2A. Choose 3 non-wild characters to follow.
// As is, chooses first 3 non-wilds for a,b,c.
// Could instead choose 3 rarest characters.
for (j=0; j<3; ++j) alpha[j] = -j;
for (i=j=0; targ[i] && j<3; ++i)
if (targ[i] != wild) {
r = alpha[j] = targ[i];
if (alpha[0]==alpha[1] || alpha[1]==alpha[2]
|| alpha[0]==alpha[2]) continue;
at[j] = i;
L[T[r]] = L[g+i] = g+i;
++j;
}
if (j != 3) {
printf (" Too few target chars\n");
continue;
}
// Step 2B. Set a,b,c to head-of-list locations, set deltas.
da = at[0];
a = H[alpha[0]]; dab = at[1]-at[0];
b = H[alpha[1]]; dbc = at[2]-at[1];
c = H[alpha[2]]; dac = at[2]-at[0];
// Step 2C. See if key characters appear in haystack
if (a >= g || b >= g || c >= g) {
printf (" No match on some character\n");
continue;
}
for (g=0; hay[g]; ++g) printf ("%d", g%10);
printf ("\n%s\n", hay); // Show haystack, for user aid
// Step 2D. Search for match
while (c < g) {
// Step 2E. Enforce delta distances
while (a+dab < b) {a = L[a]; ++sa; } // Replicate these
while (b+dbc < c) {b = L[b]; ++sb; } // 3 abc lines as many
while (a+dac > c) {c = L[c]; ++sc; } // times as you like.
while (a+dab < b) {a = L[a]; ++sa; } // Replicate these
while (b+dbc < c) {b = L[b]; ++sb; } // 3 abc lines as many
while (a+dac > c) {c = L[c]; ++sc; } // times as you like.
// Step 2F. See if delta distances were met
if (a+dab==b && b+dbc==c && c<g) {
// Step 2G. Yes, so we have 3-letter-match and need to test whole match.
r = a-da;
for (k=0; targ[k]; ++k)
if ((hay[r+k] != targ[k]) && (targ[k] != wild))
break;
printf ("# %3d, %s and ", r, targ);
for (i=0; targ[i]; ++i) putchar(hay[r++]);
// Step 2H. Report match, if found
puts (targ[k]? " differ" : " match");
// Step 2I. Advance all of a,b,c, to go on looking
a = L[a]; ++ta;
b = L[b]; ++tb;
c = L[c]; ++tc;
}
}
printf ("Advances: '%c' %d+%d '%c' %d+%d '%c' %d+%d = %d+%d = %d\n",
alpha[0], sa,ta, alpha[1], sb,tb, alpha[2], sc,tc,
sa+sb+sc, ta+tb+tc, sa+sb+sc+ta+tb+tc);
}
return 0;
}
Note, if you like this answer better than current preferred answer, unmark that one and mark this one. :)
Regular expressions usually use a finite-state-automaton-based search, I think. Try implementing that.

C++ code translation and explanation

I have the following C++ code snippet and a basic understanding of C++. Please correct my explanation of the code wherever necessary:
for (p = q->prnmsk, s = savedx->msk, j = sizeof(q->prnmsk);
j && !(*p & *s); j--, p++, s++);
What it contains: q is declared as char *q and, as per the code, is of structure type MSK.
q->prnmsk contains byte data; prnmsk holds 15 bytes.
It is similar for s.
So in the for loop, as j decreases, it goes through each byte and evaluates !(*p & *s) to decide whether to continue. If the condition is not met the loop exits early; otherwise it runs until j==0.
Am I correct? What do *p and *s mean? Will they contain the byte value?
Some (like me) might think that the following is more readable:
int j;
for (j = 0; j < sizeof(q->prnmsk); ++j)
{
if ((q->prnmsk[j] & savedx->msk[j]) != 0) break;
}
which means that q->prnmsk and savedx->msk are iterated to find the first position where the bitwise AND of both is not zero. If j equals sizeof(q->prnmsk) afterwards, all bitwise ANDs were zero.
Yes, you are right. !(*p & *s) checks that the current bytes of q->prnmsk and savedx->msk don't have any corresponding bits set to 1 simultaneously. *p and *s are the bytes that p and s currently point to.
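To see the loop in action, here is a small self-contained version. The struct layouts are assumptions: the question only says that prnmsk holds 15 bytes, so Q, Saved and first_overlap are hypothetical names used for illustration.

```cpp
#include <cstddef>

// Hypothetical layouts; the real MSK structure is not shown in the question.
struct Q     { unsigned char prnmsk[15]; };
struct Saved { unsigned char msk[15]; };

// Returns the index of the first byte where the two masks share a set bit,
// or sizeof(q->prnmsk) if no byte overlaps.
std::size_t first_overlap(const Q* q, const Saved* savedx)
{
    const unsigned char* p = q->prnmsk;
    const unsigned char* s = savedx->msk;
    std::size_t j = sizeof(q->prnmsk);
    // Same shape as the original loop: stop when j reaches 0 or the bytes overlap.
    // The j test comes first, so *p and *s are never read past the end.
    for (; j && !(*p & *s); j--, p++, s++)
        ;
    return sizeof(q->prnmsk) - j;
}
```

Bytes such as 0x0F and 0xF0 do not stop the loop because their AND is zero, while 0x81 and 0x01 do, since bit 0 is set in both.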