How can I do this task using Z-algorithm? - c++

In a question I am asked to find if the given string s contains two non-overlapping substrings "AB" and "BA" (the substrings can go in any order).
I have already solved this question another way, but since I am learning the Z-algorithm, can anyone help me solve it with that?
I know how to count the occurrences of a pattern P in a text T (by concatenating P and T), but I have no idea how to solve this task using the Z-algorithm.

To find if T contains P with Z-algorithm:
  S = P + '#' + T //extra char not occurring in strings
  for i in 0..Length(T) - 1 do
    if Z[i + Length(P) + 1] = Length(P) then
      T contains P at position i
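As a rough illustration of the pseudocode above, here is a minimal sketch in Python (not C++). The names z_function and find_occurrences are made up for this example, and it assumes the separator '#' occurs in neither string.

```python
def z_function(s):
    """z[i] = length of the longest common prefix of s and s[i:] (z[0] is left as 0)."""
    n = len(s)
    z = [0] * n
    left = right = 0  # [left, right) is the rightmost match window found so far
    for i in range(1, n):
        if i < right:
            # Reuse the value computed for the mirrored position inside the window.
            z[i] = min(right - i, z[i - left])
        # Extend the match by direct comparison.
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z


def find_occurrences(pattern, text, sep="#"):
    """Return every index i such that text[i:] starts with pattern."""
    z = z_function(pattern + sep + text)
    m = len(pattern)
    return [i for i in range(len(text)) if z[i + m + 1] == m]


print(find_occurrences("AB", "ABAXAB"))
```

The separator stops any Z value from exceeding Length(P), which is why an exact equality test is enough.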
To find if T contains both 'AB' and 'BA' without overlapping:
  Sab = 'AB#' + T
  Sba = 'BA#' + T
  Build Zab and Zba arrays with Z-algo
  PosAB_Last = Length(T) + 10 //just big value
  PosAB_Prev = PosAB_Last
  PosBA_Last = PosAB_Last
  PosBA_Prev = PosAB_Last
  for i in 0..Length(T) - 1 do
    if Zab[i + 3] = 2 then
      PosAB_Prev = PosAB_Last //keep two last positions of AB in text
      PosAB_Last = i
      //it is enough to compare positions with two last occurrences of 'BA'
      //so algo is linear
      if (i - PosBA_Last > 1) or (i - PosBA_Prev > 1) then
        Success
    else
      if Zba[i + 3] = 2 then
        PosBA_Prev = PosBA_Last
        PosBA_Last = i
        if (i - PosAB_Last > 1) or (i - PosAB_Prev > 1) then
          Success
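The same idea can be sketched in Python. This is a simplified variation, not a line-by-line translation: instead of tracking the last two positions during the scan, it collects all positions first and compares the earliest occurrence of one pattern with the latest occurrence of the other. The function names are hypothetical, and '#' is assumed not to occur in the text.

```python
def z_function(s):
    """z[i] = length of the longest common prefix of s and s[i:]."""
    n = len(s)
    z = [0] * n
    left = right = 0
    for i in range(1, n):
        if i < right:
            z[i] = min(right - i, z[i - left])
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z


def occurrences(pattern, text):
    """Sorted start positions of pattern in text, found via the Z array."""
    z = z_function(pattern + "#" + text)
    m = len(pattern)
    return [i for i in range(len(text)) if z[i + m + 1] == m]


def has_ab_and_ba(text):
    """True if text contains non-overlapping 'AB' and 'BA', in either order."""
    ab = occurrences("AB", text)
    ba = occurrences("BA", text)
    if not ab or not ba:
        return False
    # Two length-2 substrings do not overlap when their starts differ by at least 2.
    # If any valid pair exists, the earliest of one and the latest of the other is valid too.
    return ba[-1] - ab[0] >= 2 or ab[-1] - ba[0] >= 2


for sample in ("ABA", "BACFAB", "ABBA"):
    print(sample, has_ab_and_ba(sample))
```

Both Z arrays are built in linear time, so the whole check stays linear in the length of the text.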

Related

Find starting and ending index of each unique character in a string in Python

I have a string with repeated characters. My job is to find the starting index and ending index of each unique character in that string. Below is my code.
import re
x = "aaabbbbcc"
xs = set(x)
for item in xs:
    mo = re.search(item,x)
    flag = item
    m = mo.start()
    n = mo.end()
    print(flag,m,n)
Output :
a 0 1
b 3 4
c 7 8
Here the end indices of the characters are not correct. I understand why it's happening, but how can I pass the character to be matched dynamically to the regex search function? For instance, if I hardcode the character in the search function, it provides the desired output:
x = 'aabbbbccc'
xs = set(x)
mo = re.search("[b]+",x)
flag = item
m = mo.start()
n = mo.end()
print(flag,m,n)
output:
b 2 5
The above code gives the correct result, but here I can't pass the characters to be matched dynamically.
It would be a great help if someone could let me know how to achieve this; any hint will also do. Thanks in advance.
String literal formatting to the rescue:
import re
x = "aaabbbbcc"
xs = set(x)
for item in xs:
    # for patterns better use raw strings - and format the letter into it
    mo = re.search(fr"{item}+",x) # both fr and rf work :) it's a raw formatted literal
    flag = item
    m = mo.start()
    n = mo.end()
    print(flag,m,n) # fix upper limit by n-1
Output:
a 0 3 # you do see that the upper limit is off by 1?
b 3 7 # see above for fix
c 7 9
Your pattern does not need the [] around the letter - you are matching just one anyhow.
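One caveat worth sketching: if the string can contain characters that are special in regular expressions (such as "." or "+"), formatting them straight into the pattern would change its meaning. A minimal sketch, assuming a hypothetical helper first_run_span, that escapes the character with re.escape and returns an inclusive end index:

```python
import re


def first_run_span(text, ch):
    """Return (start, inclusive_end) of the first run of ch in text, or None."""
    # re.escape makes characters such as "." or "+" match literally.
    match = re.search(f"{re.escape(ch)}+", text)
    if match is None:
        return None
    # Match.end() is exclusive, so subtract 1 for an inclusive end index.
    return match.start(), match.end() - 1


print(first_run_span("aaabbbbcc", "b"))
print(first_run_span("a..b", "."))
```

Like re.search in the answer above, this only reports the first run of each character.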
Without regex [1]:
x = "aaabbbbcc"
last_ch = x[0]
start_idx = 0
# process the remainder
for idx,ch in enumerate(x[1:],1):
    if last_ch == ch:
        continue
    else:
        print(last_ch,start_idx, idx-1)
        last_ch = ch
        start_idx = idx
# print the final run; len(x)-1 also works when x has a single character
print(last_ch,start_idx,len(x)-1)
output:
a 0 2 # not off by 1
b 3 6
c 7 8
[1] RegEx: And now you have 2 problems...
Looking at the output, I'm guessing that another option would be,
import re
x = "aaabbbbcc"
xs = re.findall(r"((.)\2*)", x)
start = 0
output = ''
for item in xs:
    end = start + len(item[0])
    output += (f"{item[1]} {start} {end}\n")
    start = end
print(output)
Output
a 0 3
b 3 7
c 7 9
I think it runs in O(N) time; you can benchmark it, though, if you like.
import re, time
timer_on = time.time()
for i in range(10000000):
    x = "aabbbbccc"
    xs = re.findall(r"((.)\2*)", x)
    start = 0
    output = ''
    for item in xs:
        end = start + len(item[0])
        output += (f"{item[1]} {start} {end}\n")
        start = end
timer_off = time.time()
timer_total = timer_off - timer_on
print(timer_total)
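Another regex-free way to walk the runs, shown here only as a sketch, is itertools.groupby from the standard library, which groups consecutive equal characters. The helper name run_spans is hypothetical, and the end index is inclusive.

```python
from itertools import groupby


def run_spans(text):
    """Return (character, start, inclusive_end) for each run of equal characters."""
    spans = []
    start = 0
    for ch, group in groupby(text):
        length = sum(1 for _ in group)  # number of characters in this run
        spans.append((ch, start, start + length - 1))
        start += length
    return spans


for ch, first, last in run_spans("aaabbbbcc"):
    print(ch, first, last)
```

If a character appears in several separate runs, each run is reported on its own.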

Converting into range in Python

I want to convert a list into ranges.
a = ['Eth1/1', 'Eth1/2', 'Eth1/3', 'Eth1/4', 'Eth1/5', 'Eth1/6', 'Eth1/7', 'Eth1/8', 'Eth1/9', 'Eth1/10','Eth2/1', 'Eth2/2', 'Eth2/3', 'Eth2/4', 'Eth2/5', 'Eth2/6','Eth3/1', 'Eth3/2', 'Eth3/3', 'Eth3/4', 'Eth3/5', 'Eth3/6','Eth4/1', 'Eth4/2', 'Eth4/3', 'Eth4/4', 'Eth4/5', 'Eth4/6']
What I am trying:
fp = open('mode.txt' , 'w+')
for i in a:
    fp.write('confi ' + i + '\n mode \n')
What I am looking for:
confi Eth1/1-5
mode
confi Eth1/6-10
mode
confi Eth2/1-6
mode
confi Eth3/1-6
mode
confi Eth4/1-6
mode
Any idea how to do this ?
You could use a loop that treats the current element as the start of a range. If it starts with Eth1, the end is the element four positions later, so each Eth1 range covers five ports. Otherwise, remember the starting Eth_ prefix and advance through the list until the prefix changes or the list ends; the last element with that prefix is the end.
a = ['Eth1/1', 'Eth1/2', 'Eth1/3', 'Eth1/4', 'Eth1/5', 'Eth1/6', 'Eth1/7', 'Eth1/8', 'Eth1/9', 'Eth1/10','Eth2/1', 'Eth2/2', 'Eth2/3', 'Eth2/4', 'Eth2/5', 'Eth2/6','Eth3/1', 'Eth3/2', 'Eth3/3', 'Eth3/4', 'Eth3/5', 'Eth3/6','Eth4/1', 'Eth4/2', 'Eth4/3', 'Eth4/4', 'Eth4/5', 'Eth4/6']
i = 0
while i < len(a):
    start = a[i].split('/')
    if (start[0] == 'Eth1'):
        i += 5
    else:
        key = start[0]
        i += 1
        while i < len(a) and a[i].split('/')[0] == key:
            i += 1
    end = a[i-1].split('/')
    print('confi ' + start[0] + '/' + start[1] + '-' + end[1] + '\n mode\n')
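If hardcoding 'Eth1' feels too specific, one possible generalization is sketched below. It assumes the list is already ordered by prefix, and it introduces a hypothetical chunk_sizes mapping (not part of the question) that says which prefixes should be split and into how many ports per range.

```python
from itertools import groupby


def port_ranges(ports, chunk_sizes=None):
    """Collapse consecutive 'Prefix/number' entries into 'Prefix/first-last' ranges."""
    chunk_sizes = chunk_sizes or {}
    ranges = []
    # groupby only merges neighbours, so ports with the same prefix must be adjacent.
    for prefix, group in groupby(ports, key=lambda p: p.split('/')[0]):
        numbers = [p.split('/')[1] for p in group]
        # Prefixes without a configured chunk size become one single range.
        size = chunk_sizes.get(prefix, len(numbers))
        for i in range(0, len(numbers), size):
            chunk = numbers[i:i + size]
            ranges.append(f"{prefix}/{chunk[0]}-{chunk[-1]}")
    return ranges


ports = [f"Eth1/{n}" for n in range(1, 11)] + [f"Eth2/{n}" for n in range(1, 7)]
for r in port_ranges(ports, {"Eth1": 5}):
    print('confi ' + r + '\n mode')
```

The sketch takes the first and last entry of each chunk at face value; it does not check that the port numbers in between are contiguous.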

Split string of digits into individual cells, including digits within parentheses/brackets

I have a column where each cell has a string of digits, ?, -, and digits in parentheses/brackets/curly brackets. A good example would be something like the following:
3????0{1012}?121-2[101]--01221111(01)1
How do I separate the string into different cells by characters, where a 'character' in this case refers to any number, ?, -, and value within the parentheses/brackets/curly brackets (including said parentheses/brackets/curly brackets)?
In essence, the string above would turn into the following (spaced apart to denote a separate cell):
3 ? ? ? ? 0 {1012} ? 1 2 1 - 2 [101] - - 0 1 2 2 1 1 1 1 (01) 1
The amount of numbers within the parentheses/brackets/curly brackets vary. There are no letters in any of the strings.
Here you are!
RegEx method:
Sub Test_RegEx()
Dim s, col, m
s = "3????0{1012}?121-2[101]--01221111(01)1"
Set col = CreateObject("Scripting.Dictionary")
With CreateObject("VBScript.RegExp")
.Global = True
.Pattern = "(?:\d|-|\?|\(\d+\)|\[\d+\]|\{\d+\})"
For Each m In .Execute(s)
col(col.Count) = m
Next
End With
MsgBox Join(col.items) ' 3 ? ? ? ? 0 {1012} ? 1 2 1 - 2 [101] - - 0 1 2 2 1 1 1 1 (01) 1
End Sub
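To see what the pattern above matches without opening the VBA editor, here is the same alternation tried out in Python. This is only an illustration of the regular expression, not a replacement for the macro; split_cells is a hypothetical name.

```python
import re

# One token is: a single digit, "-", "?", or a bracketed group of digits.
TOKEN = re.compile(r"\d|-|\?|\(\d+\)|\[\d+\]|\{\d+\}")


def split_cells(s):
    """Return the list of tokens that would each go into a separate cell."""
    return TOKEN.findall(s)


print(" ".join(split_cells("3????0{1012}?121-2[101]--01221111(01)1")))
```

Because a bare digit cannot start with a bracket, the order of the alternatives does not matter here; characters outside the pattern (such as letters) would simply be skipped.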
Loop method:
Sub Test_Loop()
Dim s, col, q, t, k, i
s = "3????0{1012}?121-2[101]--01221111(01)1"
Set col = CreateObject("Scripting.Dictionary")
q = "_"
t = True
k = 0
For i = 1 To Len(s)
t = (t Or InStr(1, ")]}", q) > 0) And InStr(1, "([{", q) = 0
q = Mid(s, i, 1)
If t Then k = k + 1
col(k) = col(k) & q
Next
MsgBox Join(col.items) ' 3 ? ? ? ? 0 {1012} ? 1 2 1 - 2 [101] - - 0 1 2 2 1 1 1 1 (01) 1
End Sub
Something else to look at :)
Sub test()
'String to parse through
Dim aStr As String
'final string to print
Dim finalString As String
aStr = "3????0{1012}?121-2[101]--01221111(01)1"
'Loop through string
For i = 1 To Len(aStr)
'The character to look at
char = Mid(aStr, i, 1)
'Check if the character is an opening brace, curly brace, or parenthesis
Dim result As String
Select Case char
Case "["
result = loop_until_end(Mid(aStr, i + 1), "]")
i = i + Len(result)
result = char & result
Case "("
result = loop_until_end(Mid(aStr, i + 1), ")")
i = i + Len(result)
result = char & result
Case "{"
result = loop_until_end(Mid(aStr, i + 1), "}")
i = i + Len(result)
result = char & result
Case Else
result = Mid(aStr, i, 1)
End Select
finalString = finalString & result & " "
Next
Debug.Print (finalString)
End Sub
'Loops through and concatenate to a final string until the end_char is found
'Returns a substring starting from the character after
Function loop_until_end(aStr, end_char)
idx = 1
If (Len(aStr) <= 1) Then
loop_until_end = aStr
Else
char = Mid(aStr, idx, 1)
Do Until (char = end_char)
idx = idx + 1
char = Mid(aStr, idx, 1)
Loop
End If
loop_until_end = Mid(aStr, 1, idx)
End Function
Assuming the data is in column A starting in row 1, and that you want the results to start in column B and go right for each row of data in column A, here is an alternate method using only worksheet formulas.
In cell B1 use this formula:
=IF(OR(LEFT(A1,1)={"(","[","{"}),LEFT(A1,MIN(FIND({")","]","}"},A1&")]}"))),IFERROR(--LEFT(A1,1),LEFT(A1,1)))
In cell C1 use this formula:
=IF(OR(MID($A1,SUMPRODUCT(LEN($B1:B1))+1,1)={"(","[","{"}),MID($A1,SUMPRODUCT(LEN($B1:B1))+1,MIN(FIND({")","]","}"},$A1&")]}",SUMPRODUCT(LEN($B1:B1))+1))-SUMPRODUCT(LEN($B1:B1))),IFERROR(--MID($A1,SUMPRODUCT(LEN($B1:B1))+1,1),MID($A1,SUMPRODUCT(LEN($B1:B1))+1,1)))
Copy the C1 formula right until it starts giving you blanks (meaning there are no more items left to split out of the string in the A cell). In your example, you need to copy it right to column AA. Then you can copy the formulas down for the rest of your column A data.

Matching two lists in excel

I am trying to compare two months' sales to each other in Excel in the most automated way possible (just so it will be quicker for future months).
This month's values are all worked out through formulae, and last month's will be copied and pasted into D:E. However, as you can see, there are some customers that made purchases last month and then did not this month (and vice versa). I basically need to have all CustomerIDs matching row by row. So, for example, it should end up like this:
Can anyone think of a good way of doing this without having to do it all manually? Thanks
Use the SUMIFS function or VLOOKUP. Like this:
http://screencast.com/t/VTBZrfHjo8tk
You should just have your entire customer list on one sheet and then add up the values associated with them month over month. The design you are describing is going to be a nightmare to maintain over time and serves no purpose. I can understand you would like to see the customers in a row like that, which is why I suggest SUMIFS.
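The "one customer list, look up each month's value" design can be sketched outside Excel as well. The Python below is only an illustration of the idea; the dictionaries stand in for the two pasted ranges, align_months is a hypothetical name, and a missing customer is assumed to count as 0, much as SUMIFS returns 0 when nothing matches.

```python
def align_months(this_month, last_month):
    """Return one row per customer ID: (customer_id, this_month_value, last_month_value)."""
    # The master list is the union of the customers seen in either month.
    customer_ids = sorted(set(this_month) | set(last_month))
    return [
        (cid, this_month.get(cid, 0), last_month.get(cid, 0))
        for cid in customer_ids
    ]


rows = align_months({"C1": 10, "C3": 5}, {"C1": 7, "C2": 4})
for row in rows:
    print(row)
```

Every customer ends up on exactly one row, so the two months line up without inserting blank cells by hand.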
This option compares only two columns; I think you should approach it another way.
First, I would add a date/month column, and then you can append each following month's values below:
Then you can use a simple pivot table to see several months at the same time.
In any case, if you want to line up your two columns, you can use this code (you will need to update it with your own references; I used the data from your example image):
Sub OrderMachColumns()
Dim lastRow As Long
Dim sortarray(1 To 2, 1 To 2) As String
Dim x As Long, y As Long, Z As Long
Dim TempTxt11 As String
Dim TempTxt12 As String
Dim TempTxt21 As String
Dim TempTxt22 As String
lastRow = Range("A3").End(xlDown).Row ' I use column A, same your example
For x = 3 To lastRow * 2
Cells(x, 1).Select
If Cells(x, 1) = "" Then GoTo B
If Cells(x, 4) = "" Then GoTo A
If Cells(x, 1) = Cells(x, 4) Then
Else
If Cells(x, 1).Value = Cells(x - 1, 4).Value Then
Range(Cells(x - 1, 4), Cells(x - 1, 5)).Select
Selection.Insert Shift:=xlDown, CopyOrigin:=xlFormatFromLeftOrAbove
ElseIf Cells(x, 1).Value = Cells(x + 1, 4).Value Then
Range(Cells(x, 1), Cells(x, 2)).Select
Selection.Insert Shift:=xlDown, CopyOrigin:=xlFormatFromLeftOrAbove
Else
sortarray(1, 1) = Cells(x, 1).Value
sortarray(1, 2) = "Cells(" & x & ", 1)"
sortarray(2, 1) = Cells(x, 4).Value
sortarray(2, 2) = "Cells(" & x & ", 4)"
For Z = LBound(sortarray) To UBound(sortarray)
For y = Z To UBound(sortarray)
If UCase(sortarray(y, 1)) > UCase(sortarray(Z, 1)) Then
TempTxt11 = sortarray(Z, 1)
TempTxt12 = sortarray(Z, 2)
TempTxt21 = sortarray(y, 1)
TempTxt22 = sortarray(y, 2)
sortarray(Z, 1) = TempTxt21
sortarray(y, 1) = TempTxt11
sortarray(Z, 2) = TempTxt22
sortarray(y, 2) = TempTxt12
End If
Next y
Next Z
Select Case sortarray(1, 2)
Case "Cells(" & x & ", 1)"
Range(Cells(x, 1), Cells(x, 2)).Select
Case "Cells(" & x & ", 4)"
Range(Cells(x, 4), Cells(x, 5)).Select
End Select
Selection.Insert Shift:=xlDown, CopyOrigin:=xlFormatFromLeftOrAbove
End If
End If
A:
Next x
B:
End Sub

Understanding Recursive Function

I'm working through the book NLP with Python, and I came across this example from an 'advanced' section. I'd appreciate help understanding how it works. The function computes all possibilities of a number of syllables to reach a 'meter' length n. Short syllables "S" take up one unit of length, while long syllables "L" take up two units of length. So, for a meter length of 4, the return statement looks like this:
['SSSS', 'SSL', 'SLS', 'LSS', 'LL']
The function:
def virahanka1(n):
    if n == 0:
        return [""]
    elif n == 1:
        return ["S"]
    else:
        s = ["S" + prosody for prosody in virahanka1(n-1)]
        l = ["L" + prosody for prosody in virahanka1(n-2)]
        return s + l
The part I don't understand is how the 'SSL', 'SLS', and 'LSS' matches are made, if s and l are separate lists. Also in the line "for prosody in virahanka1(n-1)," what is prosody? Is it what the function is returning each time? I'm trying to think through it step by step but I'm not getting anywhere. Thanks in advance for your help!
Adrian
Let's just build the function from scratch. That's a good way to understand it thoroughly.
Suppose then that we want a recursive function to enumerate every combination of Ls and Ss to make a given meter length n. Let's just consider some simple cases:
n = 0: Only way to do this is with an empty string.
n = 1: Only way to do this is with a single S.
n = 2: You can do it with a single L, or two Ss.
n = 3: LS, SL, SSS.
Now, think about how you might build the answer for n = 4 given the above data. Well, the answer would either involve adding an S to a meter length of 3, or adding an L to a meter length of 2. So, the answer in this case would be LL, LSS from n = 2 and SLS, SSL, SSSS from n = 3. You can check that this is all possible combinations. We can also see that n = 2 and n = 3 can be obtained from n = 0,1 and n=1,2 similarly, so we don't need to special-case them.
Generally, then, for n ≥ 2, you can derive the strings for length n by looking at strings of length n-1 and length n-2.
Then, the answer is obvious:
if n = 0, return just an empty string
if n = 1, return a single S
otherwise, return the result of adding an S to all strings of meter length n-1, combined with the result of adding an L to all strings of meter length n-2.
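The three rules above can also be applied from the bottom up, building length 2 from lengths 0 and 1, then length 3 from lengths 1 and 2, and so on. The sketch below is just another way to look at the same recurrence; the name virahanka_bottom_up is made up for this example.

```python
def virahanka_bottom_up(n):
    """Build the meters of length n from the two previous lengths, without recursion."""
    if n == 0:
        return [""]
    prev2, prev1 = [""], ["S"]  # results for lengths 0 and 1
    for _ in range(2, n + 1):
        # "S" in front of every meter one unit shorter,
        # "L" in front of every meter two units shorter.
        current = ["S" + p for p in prev1] + ["L" + p for p in prev2]
        prev2, prev1 = prev1, current
    return prev1


for length in range(5):
    print(length, virahanka_bottom_up(length))
```

Each pass through the loop performs exactly the "s + l" step of the recursive version, which is where mixed results such as 'SSL' and 'LSS' come from.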
By the way, the recursive virahanka1 as written is a bit inefficient because it recalculates a lot of values. That would make it very slow if you asked for e.g. n = 30. You can make it faster very easily by using lru_cache from functools (available since Python 3.2):
from functools import lru_cache
@lru_cache(maxsize=None)
def virahanka1(n):
    ...
This caches results for each n, making it much faster.
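A complete, runnable version of that idea might look like the sketch below. It uses a different name (virahanka_cached) and returns tuples instead of lists; that is a choice made for this example so that callers cannot accidentally modify a cached result.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def virahanka_cached(n):
    """Same recurrence as virahanka1, but each n is computed only once."""
    if n == 0:
        return ("",)
    if n == 1:
        return ("S",)
    s = tuple("S" + prosody for prosody in virahanka_cached(n - 1))
    l = tuple("L" + prosody for prosody in virahanka_cached(n - 2))
    return s + l


print(list(virahanka_cached(4)))
print(len(virahanka_cached(20)))
```

The number of meters still grows like the Fibonacci numbers, so the cache removes the repeated work but not the size of the answer itself.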
I tried to melt my brain. I added print statements to explain to me what was happening. I think the most confusing part about recursive calls is that it seems to go into the call forward but come out backwards, as you may see with the prints when you run the following code;
def virahanka1(n):
    if n == 4:
        print('Lets Begin for ', n)
    else:
        print('recursive call for ', n, '\n')
    if n == 0:
        print('n = 0 so adding "" to below')
        return [""]
    elif n == 1:
        print('n = 1 so returning S for below')
        return ["S"]
    else:
        print('next recursivly call ' + str(n) + '-1 for S')
        s = ["S" + prosody for prosody in virahanka1(n-1)]
        print('"S" + each string in s equals', s)
        if n == 4:
            print('**Above is the result for s**')
        print('n =',n,'\n', 'next recursivly call ' + str(n) + '-2 for L')
        l = ["L" + prosody for prosody in virahanka1(n-2)]
        print('\t','what was returned + each string in l now equals', l)
        if n == 4:
            print('**Above is the result for l**','\n','**Below is the end result of s + l**')
        print('returning s + l',s+l,'for below', '\n','='*70)
        return s + l
virahanka1(4)
Still confusing for me, but with this and Jocke's elegant explanation, I think I can understand what is going on.
How about you?
Below is what the code above produces;
Lets Begin for 4
next recursivly call 4-1 for S
recursive call for 3
next recursivly call 3-1 for S
recursive call for 2
next recursivly call 2-1 for S
recursive call for 1
n = 1 so returning S for below
"S" + each string in s equals ['SS']
n = 2
next recursivly call 2-2 for L
recursive call for 0
n = 0 so adding "" to below
what was returned + each string in l now equals ['L']
returning s + l ['SS', 'L'] for below
======================================================================
"S" + each string in s equals ['SSS', 'SL']
n = 3
next recursivly call 3-2 for L
recursive call for 1
n = 1 so returning S for below
what was returned + each string in l now equals ['LS']
returning s + l ['SSS', 'SL', 'LS'] for below
======================================================================
"S" + each string in s equals ['SSSS', 'SSL', 'SLS']
**Above is the result for s**
n = 4
next recursivly call 4-2 for L
recursive call for 2
next recursivly call 2-1 for S
recursive call for 1
n = 1 so returning S for below
"S" + each string in s equals ['SS']
n = 2
next recursivly call 2-2 for L
recursive call for 0
n = 0 so adding "" to below
what was returned + each string in l now equals ['L']
returning s + l ['SS', 'L'] for below
======================================================================
what was returned + each string in l now equals ['LSS', 'LL']
**Above is the result for l**
**Below is the end result of s + l**
returning s + l ['SSSS', 'SSL', 'SLS', 'LSS', 'LL'] for below
======================================================================
This function says that:
virahanka1(n) is the same as [""] when n is zero, ["S"] when n is 1, and s + l otherwise.
Where s is the result of prepending "S" to each element in the list returned by virahanka1(n - 1), and l is the result of prepending "L" to each element of virahanka1(n - 2).
So the computation would be:
When n is 0:
[""]
When n is 1:
["S"]
When n is 2:
s = ["S" + "S"]
l = ["L" + ""]
s + l = ["SS", "L"]
When n is 3:
s = ["S" + "SS", "S" + "L"]
l = ["L" + "S"]
s + l = ["SSS", "SL", "LS"]
When n is 4:
s = ["S" + "SSS", "S" + "SL", "S" + "LS"]
l = ["L" + "SS", "L" + "L"]
s + l = ["SSSS", "SSL", "SLS", "LSS", "LL"]
And there you have it, step by step.
You need to know the results of the other function calls in order to calculate the final value, which can be pretty messy to do manually, as you can see. It is important, though, that you do not try to think recursively in your head. This would cause your mind to melt. I described the function in words so that you can see that these kinds of functions are descriptions, and not a sequence of commands.
The prosody you see in the definitions of s and l is a variable. It is used in a list comprehension, which is a way of building lists: it takes, in turn, each string in the list returned by the recursive call. I've described earlier how this list is built.
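To make the role of prosody concrete, here is a small sketch that writes one of those list comprehensions as an ordinary loop. The helper name prepend_all is hypothetical; it does what the expression ["S" + prosody for prosody in ...] does.

```python
def prepend_all(prefix, strings):
    """Loop form of [prefix + prosody for prosody in strings]."""
    result = []
    for prosody in strings:  # prosody is each string of the list in turn
        result.append(prefix + prosody)
    return result


shorter = ["SS", "L"]  # the meters of length 2
print(prepend_all("S", shorter))
print(["S" + prosody for prosody in shorter])
```

Both lines print the same list, so prosody is simply the loop variable; it never holds the whole list returned by the recursive call.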