Creating a DFA at run time. How many states?

Creating a DFA at run time. How many states? - c++

I am currently working on a program that will take a English description of a language and then use the description to create a DFA for those specs. I allow certain operations, example {w | w has the sub string 01 in the beginning} and other options such as even odd sub string, more less or exactly than k sub string, etc. The user also chooses the alphabet.
My question is how would i know how many states I will need? since the user gives me my alphabet and rules I don't know anything until run time. I have created DFA's/transition tables before, but in those cases I knew what my DFA was and could declare it in a class or have it static. Should I be using the 5 tupil (Q, ∑, δ, q0, F)? or taking a different approach? Any help wrapping my head around the problem is appreciated.

My question is how would i know how many states I will need?
You won't.
You can represent the transition function as std::unordered_map of pairs of (q, σ) ∈ (Q, Σ) to q ∈ Q
using state = int;
using symbol = char;
using tranFn = std::unordered_map<std::pair<state, symbol>, state>;
// sets up transition function, the values can be read at runtime
tranFn fn;
fn[{0, 'a'}] = 1;
fn[{0, 'b'}] = 2;
fn[{1, 'a'}] = 1;
fn[{1, 'b'}] = 0;
fn[{2, 'a'}] = 2;
fn[{2, 'b'}] = 2;
// run the dfa
state s = 0;
symbol sym;
while(std::cin >> sym)
s = fn[{s, sym}];
std::cout << s;
btw, if 1 is the accepting state in the dfa above, it accepts exactly a(a + ba)*

Related

How to use a string or a char vector (containing any chemical composition respectively formula) and calculate its molar mass?

I try to write a simple console application in C++ which can read any chemical formula and afterwards compute its molar mass, for example:
Na2CO3, or something like:
La0.6Sr0.4CoO3, or with brackets:
Fe(NO3)3
The problem is that I don't know in detail how I can deal with the input stream. I think that reading the input and storing it into a char vector may be in this case a better idea than utilizing a common string.
My very first idea was to check all elements (stored in a char vector), step by step: When there's no lowercase after a capital letter, then I have found e.g. an element like Carbon 'C' instead of "Co" (Cobalt) or "Cu" (Copper). Basically, I've tried with the methods isupper(...), islower(...) or isalpha(...).
// first idea, but it seems to be definitely the wrong way
// read input characters from char vector
// check if element contains only one or two letters
// ... and convert them to a string, store them into a new vector
// ... finally, compute the molar mass elsewhere
// but how to deal with the numbers... ?
for (unsigned int i = 0; i < char_vec.size()-1; i++)
{
if (islower(char_vec[i]))
{
char arr[] = { char_vec[i - 1], char_vec[i] };
string temp_arr(arr, sizeof(arr));
element.push_back(temp_arr);
}
else if (isupper(char_vec[i]) && !islower(char_vec[i+1]))
{
char arrSec[] = { char_vec[i] };
string temp_arrSec(arrSec, sizeof(arrSec));
element.push_back(temp_arrSec);
}
else if (!isalpha(char_vec[i]) || char_vec[i] == '.')
{
char arrNum[] = { char_vec[i] };
string temp_arrNum(arrNum, sizeof(arrNum));
stoechiometr_num.push_back(temp_arrNum);
}
}
I need a simple algorithm which can handle with letters and numbers. There also may be the possibility working with pointer, but currently I am not so familiar with this technique. Anyway I am open to that understanding in case someone would like to explain to me how I could use them here.
I would highly appreciate any support and of course some code snippets concerning this problem, since I am thinking for many days about it without progress… Please keep in mind that I am rather a beginner than an intermediate.

This problem is surely not for a beginner but I will try to give you some idea about how you can do that.
Assumption: I am not considering Isotopes case in which atomic mass can be different with same atomic number.
Model it to real world.
How will you solve that in real life?
Say, if I give you Chemical formula: Fe(NO3)3, What you will do is:
Convert this to something like this:
Total Mass => [1 of Fe] + [3 of NO3] => [1 of Fe] + [ 3 of [1 of N + 3 of O ] ]
=> 1 * Fe + 3 * (1 * N + 3 * O)
Then, you will search for individual masses of elements and then substitute them.
Total Mass => 1 * 56 + 3 * (1 * 14 + 3 * 16)
=> 242
Now, come to programming.
Trust me, you have to do the same in programming also.
Convert your chemical formula to the form discussed above i.e. Convert Fe(NO3)3 to Fe*1+(N*1+O*3)*3. I think this is the hardest part in this problem. But it can be done also by breaking down into steps.
Check if all the elements have number after it. If not, then add "1" after it. For example, in this case, O has a number after it which is 3. But Fe and N doesn't have it.
After this step, your formula should change to Fe1(N1O3)3.
Now, Convert each number, say num of above formula to:
*num+ If there is some element after current number.
*num If you encountered ')' or end of formula after it.
After this, your formula should change to Fe*1+(N*1+O*3)*3.
Now, your problem is to solve the above formula. There is a very easy algorithm for this. Please refer to: https://www.geeksforgeeks.org/expression-evaluation/. In your case, your operands can be either a number (say 2) or an element (say Fe). Your operators can be * and +. Parentheses can also be present.
For finding individual masses, you may maintain a std::map<std::string, int> containing element name as key and its mass as value.
Hope this helps a bit.

using floating point arithmetic with Z3 C++ APIs

I am trying to solve a problem of nonlinear real numbers using Z3. I need the Z3 to generate multiple solutions.
In the problem domain, precision is not a critical issue; I need just one or two decimal digits after the decimal point. so, I need to set Z3 not to explore all the search space of real numbers to minimize the time to find multiple solutions.
I am trying to replace the real numbers with floating point numbers. I read the fpa example in the c_api.c file but I found it a little bit confusing for me.
for example, let me assume that I want to convert the reals in the following code:
config cfg;
cfg.set("auto_config", true);
context con(cfg);
expr x = con.real_const("x");
expr y = con.real_const("y");
solver sol(con);
sol.add(x*y > 10);
std::cout << sol.check() << "\n";
std::cout << sol.get_model() << "\n";
}
I tried the following code but it didn't work
config cfg;
cfg.set("auto_config", true);
context con(cfg);
expr sign = con.bv_const("sig", 1);
expr exp = con.bv_const("exp", 10);
expr sig = con.bv_const("sig", 10);
expr x = to_expr(con, Z3_mk_fpa_fp(con, sign, exp, sig));
expr y = to_expr(con, Z3_mk_fpa_fp(con, sign, exp, sig));
solver sol(con);
sol.add(x*y > 10);
std::cout << sol.check() << "\n";
and the output is:
Assertion failed: false, file c:\users\rehab\downloads\z3-master\z3-master\src\a
pi\c++\z3++.h, line 1199
My questions are:
Are there any detailed examples or code snippets about using fpa in C++ APIs? it is not clear to me how to convert the fpa example in the C API to C++ API.
What's wrong in the above code conversion?

I'm not sure if using floats is the best way to go for your problem. But sounds like you tried all other options and non-linearity is getting in your way. Note that even if you model your problem with floats, floating-point arithmetic is quite tricky and solver may have hard time finding satisfying models. Furthermore, solutions maybe way far off from actual results due to numerical instability.
Using C
Leaving all those aside, the correct way to code your query using the C api would be (assuming we use 32-bit single-precision floats):
#include <z3.h>
int main(void) {
Z3_config cfg = Z3_mk_config();
Z3_context ctx = Z3_mk_context(cfg);
Z3_solver s = Z3_mk_solver(ctx);
Z3_solver_inc_ref(ctx, s);
Z3_del_config(cfg);
Z3_sort float_sort = Z3_mk_fpa_sort(ctx, 8, 24);
Z3_symbol s_x = Z3_mk_string_symbol(ctx, "x");
Z3_symbol s_y = Z3_mk_string_symbol(ctx, "y");
Z3_ast x = Z3_mk_const(ctx, s_x, float_sort);
Z3_ast y = Z3_mk_const(ctx, s_y, float_sort);
Z3_symbol s_x_times_y = Z3_mk_string_symbol(ctx, "x_times_y");
Z3_ast x_times_y = Z3_mk_const(ctx, s_x_times_y, float_sort);
Z3_ast c1 = Z3_mk_eq(ctx, x_times_y, Z3_mk_fpa_mul(ctx, Z3_mk_fpa_rne(ctx), x, y));
Z3_ast c2 = Z3_mk_fpa_gt(ctx, x_times_y, Z3_mk_fpa_numeral_float(ctx, 10, float_sort));
Z3_solver_assert(ctx, s, c1);
Z3_solver_assert(ctx, s, c2);
Z3_lbool result = Z3_solver_check(ctx, s);
switch(result) {
case Z3_L_FALSE: printf("unsat\n");
break;
case Z3_L_UNDEF: printf("undef\n");
break;
case Z3_L_TRUE: { Z3_model m = Z3_solver_get_model(ctx, s);
if(m) Z3_model_inc_ref(ctx, m);
printf("sat\n%s\n", Z3_model_to_string(ctx, m));
break;
}
}
return 0;
}
When run, this prints:
sat
x_times_y -> (fp #b0 #xbe #b10110110110101010000010)
y -> (fp #b0 #xb5 #b00000000000000000000000)
x -> (fp #b0 #x88 #b10110110110101010000010)
These are single-precision floating point numbers; you can read about them in wikipedia for instance. In more conventional notation, they are:
x_times_y -> 1.5810592e19
y -> 1.8014399e16
x -> 877.6642
This is quite tricky to use, but what you have asked.
Using Python
I'd heartily recommend using the Python API to at least see what the solver is capable of before investing into such complicated C code. Here's how it would look in Python:
from z3 import *
x = FP('x', FPSort(8, 24))
y = FP('y', FPSort(8, 24))
s = Solver()
s.add(x*y > 10);
s.check()
print s.model()
When run, this prints:
[y = 1.32167303562164306640625,
x = 1.513233661651611328125*(2**121)]
Perhaps not what you expected, but it is a valid model indeed.
Using Haskell
Just to give you a taste of simplicity, here's how the same problem can be expressed using the Haskell bindings (It's just a mere one liner!)
Prelude Data.SBV> sat $ \x y -> fpIsPoint x &&& fpIsPoint y &&& x * y .> (10::SFloat)
Satisfiable. Model:
s0 = 5.1129496e28 :: Float
s1 = 6.6554557e9 :: Float
Summary
Note that Floating-point also has issues regarding NaN/Infinity values, so you might have to avoid those explicitly. (This is what the Haskell expression did by using the isFPPoint predicate. Coding it in Python or C would require more code, but is surely doable.)
It should be emphasized that literally any other binding to Z3 (Python, Haskell, Scala, what have you) will give you a better experience than what you'll get with C/C++/Java. (Even direct coding in SMTLib would be nicer.)
So, I heartily recommend using some higher-level interface (Python is a good one: It is easy to learn), and once you are confident with the model and how it works, you can then start coding the same in C if necessary.

Attach a string to every group of substrings separated by a set of symbols

Imagine I have string
Newton = 'kg*m/s^2'
and I need it to be:
NewtonMupad = 'unit::kg*unit::m/unit::s^2'
Is there a simple way to detect every physical unit and attach unit:: to it? It can be assumed every unit is seperated by either /, * or an exponent ^2 or ^3.
For now I used several regular expressions, like
x = regexp(Newton ,'*','split')
y = regexp(Newton ,'/','split')
z = regexp(Newton ,'^','split')
and I'm able to create the string I need with a loop. But I wonder if there is any simpler and faster solution using Matlab?

You can use regexprep:
>> Newton = 'kg*m/s^2'
>> regexprep(Newton,'(([a-zA-Z]+)(*|/|\^|$))', 'unit::$1')
ans =
unit::kg*unit::m/unit::s^2

Solving a linear equation in one variable

What would be the most efficient algorithm to solve a linear equation in one variable given as a string input to a function? For example, for input string:
"x + 9 – 2 - 4 + x = – x + 5 – 1 + 3 – x"
The output should be 1.
I am considering using a stack and pushing each string token onto it as I encounter spaces in the string. If the input was in polish notation then it would have been easier to pop numbers off the stack to get to a result, but I am not sure what approach to take here.
It is an interview question.

Solving the linear equation is (I hope) extremely easy for you once you've worked out the coefficients a and b in the equation a * x + b = 0.
So, the difficult part of the problem is parsing the expression and "evaluating" it to find the coefficients. Your example expression is extremely simple, it uses only the operators unary -, binary -, binary +. And =, which you could handle specially.
It is not clear from the question whether the solution should also handle expressions involving binary * and /, or parentheses. I'm wondering whether the interview question is intended:
to make you write some simple code, or
to make you ask what the real scope of the problem is before you write anything.
Both are important skills :-)
It could even be that the question is intended:
to separate those with lots of experience writing parsers (who will solve it as fast as they can write/type) from those with none (who might struggle to solve it at all within a few minutes, at least without some hints).
Anyway, to allow for future more complicated requirements, there are two common approaches to parsing arithmetic expressions: recursive descent or Dijkstra's shunting-yard algorithm. You can look these up, and if you only need the simple expressions in version 1.0 then you can use a simplified form of Dijkstra's algorithm. Then once you've parsed the expression, you need to evaluate it: use values that are linear expressions in x and interpret = as an operator with lowest possible precedence that means "subtract". The result is a linear expression in x that is equal to 0.
If you don't need complicated expressions then you can evaluate that simple example pretty much directly from left-to-right once you've tokenised it[*]:
x
x + 9
// set the "we've found minus sign" bit to negate the first thing that follows
x + 7 // and clear the negative bit
x + 3
2 * x + 3
// set the "we've found the equals sign" bit to negate everything that follows
3 * x + 3
3 * x - 2
3 * x - 1
3 * x - 4
4 * x - 4
Finally, solve a * x + b = 0 as x = - b/a.
[*] example tokenisation code, in Python:
acc = None
for idx, ch in enumerate(input):
if ch in '1234567890':
if acc is None: acc = 0
acc = 10 * acc + int(ch)
continue
if acc != None:
yield acc
acc = None
if ch in '+-=x':
yield ch
elif ch == ' ':
pass
else:
raise ValueError('illegal character "%s" at %d' % (ch, idx))
Alternative example tokenisation code, also in Python, assuming there will always be spaces between tokens as in the example. This leaves token validation to the parser:
return input.split()

ok some simple psuedo code that you could use to solve this problem
function(stinrgToParse){
arrayoftokens = stringToParse.match(RegexMatching);
foreach(arrayoftokens as token)
{
//now step through the tokens and determine what they are
//and store the neccesary information.
}
//Use the above information to do the arithmetic.
//count the number of times a variable appears positive and negative
//do the arithmetic.
//add up the numbers both positive and negative.
//return the result.
}

The first thing is to parse the string, to identify the various tokens (numbers, variables and operators), so that an expression tree can be formed by giving operator proper precedences.
Regular expressions can help, but that's not the only method (grammar parsers like boost::spirit are good too, and you can even run your own: its all a "find and recourse").
The tree can then be manipulated reducing the nodes executing those operation that deals with constants and by grouping variables related operations, executing them accordingly.
This goes on recursively until you remain with a variable related node and a constant node.
At the point the solution is calculated trivially.
They are basically the same principles that leads to the production of an interpreter or a compiler.

Consider:
from operator import add, sub
def ab(expr):
a, b, op = 0, 0, add
for t in expr.split():
if t == '+': op = add
elif t == '-': op = sub
elif t == 'x': a = op(a, 1)
else : b = op(b, int(t))
return a, b
Given an expression like 1 + x - 2 - x... this converts it to a canonical form ax+b and returns a pair of coefficients (a,b).
Now, let's obtain the coefficients from both parts of the equation:
le, ri = equation.split('=')
a1, b1 = ab(le)
a2, b2 = ab(ri)
and finally solve the trivial equation a1*x + b1 = a2*x + b2:
x = (b2 - b1) / (a1 - a2)
Of course, this only solves this particular example, without operator precedence or parentheses. To support the latter you'll need a parser, presumable a recursive descent one, which would be simper to code by hand.

Wildcard String Search Algorithm

In my program I need to search in a quite big string (~1 mb) for a relatively small substring (< 1 kb).
The problem is the string contains simple wildcards in the sense of "a?c" which means I want to search for strings like "abc" or also "apc",... (I am only interested in the first occurence).
Until now I use the trivial approach (here in pseudocode)
algorithm "search", input: haystack(string), needle(string)
for(i = 0, i < length(haystack), ++i)
if(!CompareMemory(haystack+i,needle,length(needle))
return i;
return -1; (Not found)
Where "CompareMemory" returns 0 iff the first and second argument are identical (also concerning wildcards) only regarding the amount of bytes the third argument gives.
My question is now if there is a fast algorithm for this (you don't have to give it, but if you do I would prefer c++, c or pseudocode). I started here
but I think most of the fast algorithms don't allow wildcards (by the way they exploit the nature of strings).
I hope the format of the question is ok because I am new here, thank you in advance!

A fast way, which is kind of the same thing as using a regexp, (which I would recommend anyway), is to find something that is fixed in needle, "a", but not "?", and search for it, then see if you've got a complete match.
j = firstNonWildcardPos(needle)
for(i = j, i < length(haystack)-length(needle)+j, ++i)
if(haystack[i] == needle[j])
if(!CompareMemory(haystack+i-j,needle,length(needle))
return i;
return -1; (Not found)
A regexp would generate code similar to this (I believe).

Among strings over an alphabet of c characters, let S have length s and let T_1 ... T_k have average length b. S will be searched for each of the k target strings. (The problem statement doesn't mention multiple searches of a given string; I mention it below because in that paradigm my program does well.)
The program uses O(s+c) time and space for setup, and (if S and the T_i are random strings) O(k*u*s/c) + O(k*b + k*b*s/c^u) total time for searching, with u=3 in program as shown. For longer targets, u should be increased, and rare, widely-separated key characters chosen.
In step 1, the program creates an array L of s+TsizMax integers (in program, TsizMax = allowed target length) and uses it for c lists of locations of next occurrences of characters, with list heads in H[] and tails in T[]. This is the O(s+c) time and space step.
In step 2, the program repeatedly reads and processes target strings. Step 2A chooses u = 3 different non-wild key characters (in current target). As shown, the program just uses the first three such characters; with a tiny bit more work, it could instead use the rarest characters in the target, to improve performance. Note, it doesn't cope with targets with fewer than three such characters.
The line "L[T[r]] = L[g+i] = g+i;" within Step 2A sets up a guard cell in L with proper delta offset so that Step 2G will automatically execute at end of search, without needing any extra testing during the search. T[r] indexes the tail cell of the list for character r, so cell L[g+i] becomes a new, self-referencing, end-of-list for character r. (This technique allows the loops to run with a minimum of extraneous condition testing.)
Step 2B sets vars a,b,c to head-of-list locations, and sets deltas dab, dac, and dbc corresponding to distances between the chosen key characters in target.
Step 2C checks if key characters appear in S. This step is necessary because otherwise a while loop in Step 2E will hang. We don't want more checks within those while loops because they are the inner loops of search.
Step 2D does steps 2E to 2i until var c points to after end of S, at which point it is impossible to make any more matches.
Step 2E consists of u = 3 while loops, that "enforce delta distances", that is, crawl indexes a,b,c along over each other as long as they are not pattern-compatible. The while loops are fairly fast, each being in essence (with ++si instrumentation removed) "while (v+d < w) v = L[v]" for various v, d, w. Replicating the three while loops a few times may increase performance a little and will not change net results.
In Step 2G, we know that the u key characters match, so we do a complete compare of target to match point, with wild-character handling. Step 2H reports result of compare. Program as given also reports non-matches in this section; remove that in production.
Step 2I advances all the key-character indexes, because none of the currently-indexed characters can be the key part of another match.
You can run the program to see a few operation-count statistics. For example, the output
Target 5=<de?ga>
012345678901234567890123456789012345678901
abc1efgabc2efgabcde3gabcdefg4bcdefgabc5efg
# 17, de?ga and de3ga match
# 24, de?ga and defg4 differ
# 31, de?ga and defga match
Advances: 'd' 0+3 'e' 3+3 'g' 3+3 = 6+9 = 15
shows that Step 2G was entered 3 times (ie, the key characters matched 3 times); the full compare succeeded twice; step 2E while loops advanced indexes 6 times; step 2I advanced indexes 9 times; there were 15 advances in all, to search the 42-character string for the de?ga target.
/* jiw
$Id: stringsearch.c,v 1.2 2011/08/19 08:53:44 j-waldby Exp j-waldby $
Re: Concept-code for searching a long string for short targets,
where targets may contain wildcard characters.
The user can enter any number of targets as command line parameters.
This code has 2 long strings available for testing; if the first
character of the first parameter is '1' the jay[42] string is used,
else kay[321].
Eg, for tests with *hay = jay use command like
./stringsearch 1e?g a?cd bc?e?g c?efg de?ga ddee? ddee?f
or with *hay = kay,
./stringsearch bc?e? jih? pa?j ?av??j
to exercise program.
Copyright 2011 James Waldby. Offered without warranty
under GPL v3 terms as at http://www.gnu.org/licenses/gpl.html
*/
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <limits.h>
//================================================
int main(int argc, char *argv[]) {
char jay[]="abc1efgabc2efgabcde3gabcdefg4bcdefgabc5efg";
char kay[]="ludehkhtdiokihtmaihitoia1htkjkkchajajavpajkihtijkhijhipaja"
"etpajamhkajajacpajihiatokajavtoia2pkjpajjhiifakacpajjhiatkpajfojii"
"etkajamhpajajakpajihiatoiakavtoia3pakpajjhiifakacpajjhkatvpajfojii"
"ihiifojjjjhijpjkhtfdoiajadijpkoia4jihtfjavpapakjhiifjpajihiifkjach"
"ihikfkjjjjhijpjkhtfdoiajakijptoik4jihtfjakpapajjkiifjpajkhiifajkch";
char *hay = (argc>1 && argv[1][0]=='1')? jay:kay;
enum { chars=1<<CHAR_BIT, TsizMax=40, Lsiz=TsizMax+sizeof kay, L1, L2 };
int L[L2], H[chars], T[chars], g, k, par;
// Step 1. Make arrays L, H, T.
for (k=0; k<chars; ++k) H[k] = T[k] = L1; // Init H and T
for (g=0; hay[g]; ++g) { // Make linked character lists for hay.
k = hay[g]; // In same loop, could count char freqs.
if (T[k]==L1) H[k] = T[k] = g;
T[k] = L[T[k]] = g;
}
// Step 2. Read and process target strings.
for (par=1; par<argc; ++par) {
int alpha[3], at[3], a=g, b=g, c=g, da, dab, dbc, dac, i, j, r;
char * targ = argv[par];
enum { wild = '?' };
int sa=0, sb=0, sc=0, ta=0, tb=0, tc=0;
printf ("Target %d=<%s>\n", par, targ);
// Step 2A. Choose 3 non-wild characters to follow.
// As is, chooses first 3 non-wilds for a,b,c.
// Could instead choose 3 rarest characters.
for (j=0; j<3; ++j) alpha[j] = -j;
for (i=j=0; targ[i] && j<3; ++i)
if (targ[i] != wild) {
r = alpha[j] = targ[i];
if (alpha[0]==alpha[1] || alpha[1]==alpha[2]
|| alpha[0]==alpha[2]) continue;
at[j] = i;
L[T[r]] = L[g+i] = g+i;
++j;
}
if (j != 3) {
printf (" Too few target chars\n");
continue;
}
// Step 2B. Set a,b,c to head-of-list locations, set deltas.
da = at[0];
a = H[alpha[0]]; dab = at[1]-at[0];
b = H[alpha[1]]; dbc = at[2]-at[1];
c = H[alpha[2]]; dac = at[2]-at[0];
// Step 2C. See if key characters appear in haystack
if (a >= g || b >= g || c >= g) {
printf (" No match on some character\n");
continue;
}
for (g=0; hay[g]; ++g) printf ("%d", g%10);
printf ("\n%s\n", hay); // Show haystack, for user aid
// Step 2D. Search for match
while (c < g) {
// Step 2E. Enforce delta distances
while (a+dab < b) {a = L[a]; ++sa; } // Replicate these
while (b+dbc < c) {b = L[b]; ++sb; } // 3 abc lines as many
while (a+dac > c) {c = L[c]; ++sc; } // times as you like.
while (a+dab < b) {a = L[a]; ++sa; } // Replicate these
while (b+dbc < c) {b = L[b]; ++sb; } // 3 abc lines as many
while (a+dac > c) {c = L[c]; ++sc; } // times as you like.
// Step 2F. See if delta distances were met
if (a+dab==b && b+dbc==c && c<g) {
// Step 2G. Yes, so we have 3-letter-match and need to test whole match.
r = a-da;
for (k=0; targ[k]; ++k)
if ((hay[r+k] != targ[k]) && (targ[k] != wild))
break;
printf ("# %3d, %s and ", r, targ);
for (i=0; targ[i]; ++i) putchar(hay[r++]);
// Step 2H. Report match, if found
puts (targ[k]? " differ" : " match");
// Step 2I. Advance all of a,b,c, to go on looking
a = L[a]; ++ta;
b = L[b]; ++tb;
c = L[c]; ++tc;
}
}
printf ("Advances: '%c' %d+%d '%c' %d+%d '%c' %d+%d = %d+%d = %d\n",
alpha[0], sa,ta, alpha[1], sb,tb, alpha[2], sc,tc,
sa+sb+sc, ta+tb+tc, sa+sb+sc+ta+tb+tc);
}
return 0;
}
Note, if you like this answer better than current preferred answer, unmark that one and mark this one. :)

Regular expressions usually use a finite state automation-based search, I think. Try implementing that.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js