Haskell range notation to generate list. Unexpected output

I came across an exercise in one of my lectures that left me confused about the output of [2, 2 .. 2]. Why does entering [2, 2 .. 2] generate an "infinite" list of 2's?
The way I understood the notation, the first element is the start bound, the second is the "gap" between the numbers, and the last is the end of the list; in other words, stop when reaching that number.
If my reasoning is right, why does the expression [2, 2 .. 2] not output [2]?
I thought Haskell might evaluate it this way: when printing the first element of the list, it is equal to the last, therefore stop.
Or, if the first element is not checked against the "outer" bound, then the output would be [2, 2], because after adding zero to the previous number (in our case the start bound 2) we would have reached the end and therefore stop.
I obviously do not understand the workings of the notation correctly, so how does Haskell evaluate the expression?

The notation is meant to mimic the usual way of writing simple sequences in mathematics. The second element is not the step, but in fact the actual second element of the list. The rest is linearly extrapolated from there. Examples:
[1,2..10] = [1,2,3,4,5,6,7,8,9,10]
[1,3..10] = [1,3,5,7,9]
[4,3..0] = [4,3,2,1,0]
[0,5..] = [0,5,10,15,20,25,30,35... -- infinite
[1,1..] = [1,1,1,1,1,1,1,1,1,1,1... -- infinite
The reason [2,2..2] is infinite is that no value of the list is ever greater than the right endpoint, which is the terminating condition. If you want the step to be 2, you should write [2,4..2], which gives the expected output [2]. (So I guess it's not the actual second element of the list in all cases, but you see the logic.)

Basically because the standard specifies so: [e1, e2 .. e3] desugars to enumFromThenTo e1 e2 e3.
In section 6.3.4, the Haskell 98 Report says:
The sequence enumFromThenTo e1 e2 e3 is the list [e1,e1+i,e1+2i,...e3], where the increment, i, is e2-e1. If the increment is positive or zero, the list terminates when the next element would be greater than e3; the list is empty if e1 > e3. If the increment is negative, the list terminates when the next element would be less than e3; the list is empty if e1 < e3.
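For [2, 2 .. 2] the increment is zero, so the next element is always 2, which is never greater than 2. The terminating condition is never met and the list is infinite.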
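To make the quoted rule concrete, here is a rough model of it as a Python generator. This is only a sketch for integers; the name enum_from_then_to is made up for illustration, and this is not how GHC implements it.

```python
from itertools import islice

def enum_from_then_to(e1, e2, e3):
    """Model of the Haskell 98 rule for enumFromThenTo, integers only."""
    step = e2 - e1          # the increment i = e2 - e1
    x = e1
    if step >= 0:
        # Positive or zero increment: stop once the next element exceeds e3.
        while x <= e3:
            yield x
            x += step
    else:
        # Negative increment: stop once the next element drops below e3.
        while x >= e3:
            yield x
            x += step

print(list(enum_from_then_to(1, 3, 10)))           # [1, 3, 5, 7, 9]
print(list(enum_from_then_to(2, 4, 2)))            # [2]
print(list(islice(enum_from_then_to(2, 2, 2), 5))) # [2, 2, 2, 2, 2]
```

With a zero step the first loop never exits, which is why islice is needed to look at [2, 2 .. 2].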

Related

Constructing list from user input in Prolog

I recently encountered the following problem in Prolog: Define a predicate avg_num/0 that interactively prompts the user to input a series of numbers (or the word "stop" without quotation marks to end the process). The program outputs the average of these numbers.
In Prolog terminal/interpreter, the program should run like this:
?- avg_sum.
?- Enter a number (or stop to end): 2.
?- Enter a number (or stop to end): |: 3.
?- Enter a number (or stop to end): |: 7.
?- Enter a number (or stop to end): |: stop.
The average is 4.
true.
My idea is to store the value into a list, and then compute the average of numbers of that list using the following code.
% finding the sum of numbers of a list
sumlist([],0).
sumlist([H|T],Sum):- sumlist(T,SumTail), Sum is H + SumTail .
% finding the average of the numbers in a list
average(List,A):- sumlist(List,Sum), length(List,Length), A is Sum / Length .
The problem is then reduced to: how to store the number from user input to a list? Somehow I feel that I need to initialize the empty list at the beginning, but I don't know where to put it. I've tried the following piece of code (this is incomplete though):
avg_sum:- write('Enter a number (or stop to end): '), process(Ans).
process(Ans):- read(Ans),Ans \= "stop", append(X,[Ans],L).
% this part is still incomplete and incorrect.
It seems that I need to initialize the empty list, but I don't know where.

You're almost there. As you said, you need to keep a list of inputs and then do a final computation once the entire list is known. Such a list (or other object keeping intermediate data) is often called an accumulator.
Here is one way of implementing this. It is based on two auxiliary predicates, one for doing the I/O part and one for doing the computation.
We start the computation with an empty list, as you said.
avg_num :-
    avg_num([], Average),
    write('The average is '), write(Average), writeln('.').
(Recall that in Prolog you can overload predicate names with different arities, i.e., avg_num/0 is a separate predicate from avg_num/2.)
Now avg_num/2 can ask for an input and hand off processing of that input to process/3:
avg_num(Accumulator, Average) :-
    write('Enter a number (or stop to end): '),
    read(Answer),
    process(Answer, Accumulator, Average).
Only in process/3 do we decide what to do with the input: if it is stop, we consume the accumulator by computing its average; otherwise we add the answer to the accumulator and continue.
process(stop, Accumulator, Average) :-
    average(Accumulator, Average).
process(Answer, Accumulator, Average) :-
    dif(Answer, stop),
    avg_num([Answer | Accumulator], Average).
Note that the order of the numbers in the list does not matter for computing the average, so we can add the input at the front of the list, which is easier than appending at the end.
The mutual recursion between the avg_num/2 and process/3 predicates is not necessarily easy to understand, especially because the names are not very well chosen.
In practice, I would get rid of process/3 by simplifying avg_num/2 like this:
avg_num(Accumulator, Average) :-
    write('Enter a number (or stop to end): '),
    read(Answer),
    (   Answer = stop
    ->  average(Accumulator, Average)
    ;   avg_num([Answer | Accumulator], Average)
    ).
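For readers more used to imperative languages, the accumulator pattern can be sketched in Python as well. This is only an illustration of the idea, not a translation of the Prolog I/O: the answers are taken from an iterator instead of being read from the terminal, and the function name simply mirrors the predicate.

```python
def avg_num(answers, accumulator=()):
    """Read answers until 'stop', carrying the numbers seen so far along."""
    answer = next(answers)
    if answer == "stop":
        # Consume the accumulator: this fails with ZeroDivisionError
        # if 'stop' is the very first answer.
        return sum(accumulator) / len(accumulator)
    # Add the answer at the front, like [Answer | Accumulator].
    return avg_num(answers, (answer,) + accumulator)

print(avg_num(iter([2, 3, 7, "stop"])))  # 4.0
```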

I would separate out a predicate that reads the list, then handle the list afterwards.
Here's one way to read in the list until an ending term, End:
% read_list_until(L, End): read terms into L until the term End is read.
read_list_until(L, End) :-
    (   read_element(E, End)
    ->  L = [E|L1],
        read_list_until(L1, End)
    ;   L = []
    ).
read_element(E, End) :-
    read(E),
    dif(E, End).
This has the following behavior:
2 ?- read_list_until(L, stop).
|: 1.
|: 2.
|: 3.
|: stop.
L = [1, 2, 3].
3 ?-
Then you can just use standard Prolog predicates to take the average:
avg_list(L, Avg) :-
    read_list_until(L, stop),
    sum_list(L, Sum),
    length(L, Num),
    Avg is Sum / Num.

Pyspark-length of an element and how to use it later

So I have a dataset of words, and I am trying to keep only those that are longer than 6 characters:
data=dataset.map(lambda word: word,len(word)).filter(len(word)>=6)
When:
print data.take(10)
it returns all of the words, including the first 3, which are shorter than 6 characters. I don't actually want to print them, but to continue working on the data that have length greater than 6.
Once I have the appropriate dataset, I would like to be able to select the data that I need, for example the words with length less than 15, and run computations on them.
Or even apply a function to the "word".
Any ideas?

What you want is something along these lines (untested):
data=dataset.map(lambda word: (word,len(word))).filter(lambda t : t[1] >=6)
In the map, you return a tuple (word, length of word), and the filter looks at the second component of each tuple (the length, t[1]) to keep only the pairs whose length is greater than or equal to 6.
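If you want to try the lambdas without a cluster, here is a small stand-in that uses Python's built-in map and filter on an ordinary list. The word list is made up for illustration; with a real RDD the same lambdas would be passed to dataset.map and dataset.filter, and the follow-up selection (length under 15, then a function applied to the word) would be one more filter and map chained on.

```python
# Hypothetical sample data standing in for the RDD of words.
words = ["spark", "python", "dataset", "a", "transformation", "internationalization"]

# Same two steps as above: pair each word with its length, keep length >= 6.
pairs = map(lambda word: (word, len(word)), words)
data = list(filter(lambda t: t[1] >= 6, pairs))

# Later processing: keep words shorter than 15 and apply a function to the word.
shorter = [(word.upper(), length) for (word, length) in data if length < 15]

print(data)
print(shorter)
```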

NFA to an RE Kleene's Theorem

Here is my NFA:
Here is my attempt.
Create new start and final nodes
Next eliminate the 2nd node from the left which gives me ab
Next eliminate the 2nd node from the right which gives me ab*a
Next eliminate the 2nd node from the left which gives me abb*b
Next eliminate the 2nd node from the right which gives me b+ab*a
Which leads to abbb (b+aba)*
Is this the correct answer?

No, you are not correct :(
You do not need to create a new start state: the first state, marked with a - sign, is the start state. Also, a label a,b means a or b, not ab.
There is a theorem called Arden's theorem that is quite helpful for converting an NFA into an RE:
What is Regular Expression for this NFA?
In your NFA, the initial part is:
step-1:
(-) --a,b-->(1)
means (a+b)
step-2: next, from state 1 to state 2; note that state 2 is the accepting (final) state, marked with a + sign.
(1) --b--->(2+)
So you need (a+b)b to reach to final state.
step-3: Once you are at final state 2, any number of b's is accepted (any number means zero or more). This is because of the self-loop on state 2 with label b.
So b* is accepted at state 2.
step-4:
Actually, there are two loops on state 2.
One is the self-loop with label b, as I described in step-3. Its expression is b*.
The second loop on state 2 goes via state 3.
The expression for the second loop on state 2 is aa*b.
Why the expression aa*b?
Because:
           a-
           ||      ====> aa*b
           ▼|
(2+)--a-->(3) --b-->(2+)
So, in step-3 and step-4, because of the loops on state 2, a run can loop back either via the b-labeled edge or via aa*b ===> (b + aa*b)*
So the regular expression for your NFA is:
(a+b) b (b + aa*b)*
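One way to sanity-check an answer like this is to compare the regular expression against a direct simulation of the automaton on all short strings. The sketch below does that in Python. The transition table is an assumption reconstructed from the steps above (the original diagram is not available here), with the start state called 0, and + is written as | in Python's regex syntax.

```python
import re
from itertools import product

# Transitions reconstructed from the steps above (assumed, not from the diagram).
DELTA = {
    (0, "a"): 1, (0, "b"): 1,  # step 1: (-) --a,b--> (1)
    (1, "b"): 2,               # step 2: (1) --b--> (2+)
    (2, "b"): 2,               # step 3: self-loop on state 2
    (2, "a"): 3,               # step 4: detour through state 3 ...
    (3, "a"): 3,               #         ... with its own a self-loop ...
    (3, "b"): 2,               #         ... and back to state 2
}
START, FINAL = 0, {2}

def automaton_accepts(word):
    state = START
    for ch in word:
        state = DELTA.get((state, ch))
        if state is None:      # no transition defined: reject
            return False
    return state in FINAL

PATTERN = re.compile(r"[ab]b(b|aa*b)*")   # (a+b) b (b + aa*b)*

def regex_accepts(word):
    return PATTERN.fullmatch(word) is not None

def disagreements(max_len=8):
    """All strings over {a, b} up to max_len where the two disagree."""
    return [
        w
        for n in range(max_len + 1)
        for w in map("".join, product("ab", repeat=n))
        if automaton_accepts(w) != regex_accepts(w)
    ]

print(disagreements())  # [] when the expression matches the automaton
```

An empty list is not a proof, but a mismatch on a short string is a quick way to catch a wrong expression such as abbb (b+aba)*.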

simulate a deterministic pushdown automaton (PDA) in c++

I was reading a UVa exercise in which I need to simulate a deterministic pushdown automaton (PDA) to see whether certain strings are accepted by the PDA, given input in the following format:
The first line of input will be an integer C, which indicates the number of test cases. The first line of each test case contains five integers E, T, F, S and C, where E represents the number of states in the automaton, T the number of transitions, F the number of final states, S the initial state and C the number of test strings. The next line will contain F integers, which represent the final states of the automaton. Then come T lines, each with 2 integers I and J and 3 strings L, T and A, where I and J (0 ≤ I, J < E) represent the origin and destination states of a transition. L represents the character read from the tape in the transition, T represents the symbol found at the top of the stack, and A the action to perform on the top of the stack at the end of the transition. The character used to represent the bottom of the stack is always Z. The character £ (<alt+156>) is used to represent the end of the string, the action of popping, or not taking the top of the stack into account for the transition. The alphabet of the stack will be capital letters. For the string A, the symbols are pushed from right to left (in the same way as in the program JFLAP, i.e., the new top of the stack will be the leftmost character). Then come C lines, each with an input string. The input strings may contain lowercase letters and numbers (not necessarily present in any transition).
The output in the first line of each test case must display the following string "Case G:", where G represents the number of test case (starting at 1). Then C lines on which to print the word "OK" if the automaton accepts the string or "Reject" otherwise.
For example:
Input:
2
3 5 1 0 5
2
0 0 1 Z XZ
0 0 1 X XX
0 1 0 X X
1 1 1 X £
1 2 £ Z Z
111101111
110111
011111
1010101
11011
4 6 1 0 5
3
1 2 b A £
0 0 a Z AZ
0 1 a A AAA
1 0 a A AA
2 3 £ Z Z
2 2 b A £
aabbb
aaaabbbbbb
c1bbb
abbb
aaaaaabbbbbbbbb
this is the output:
Output:
Case 1:
Accepted
Rejected
Rejected
Rejected
Accepted
Case 2:
Accepted
Accepted
Rejected
Rejected
Accepted
I need some help or ideas on how to simulate this PDA. I am not asking for code that solves the problem, because I want to write my own code (the idea is to learn, right?), but I need some help (an idea or pseudocode) to begin the implementation.

You first need a data structure to hold the transitions. You can use a vector of transition structs, each holding a transition quintuple. But you can also use the fact that states are integers and create a vector that keeps the transitions from state 0 at index 0, the transitions from state 1 at index 1, and so on. This way you reduce the time spent searching for the correct transition.
You can simply use the stack from the STL for the stack. You also need a search function, which could change depending on your implementation. If you use the first method, you can use a function like:
int findIndex(vector<quintuple> v)//which finds the index of correct transition otherwise returns -1
then use the return value to get the new state and the new stack symbol.
Or you can use a for loop over the vector and a bool flag that records whether a transition was found.
With the second method, you can use a function that takes references to the new state and new stack symbol and sets them if it finds an appropriate transition.
For the inputs you can use something like vector or vector, depending on personal taste. You can implement your main method with for loops, but if you want extra difficulty you can implement a recursive function. May it be easy.
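Since you asked for an idea rather than a solution in C++, here is a rough sketch of the second approach (transitions indexed by source state) in Python. The names build_table, find_transition and accepts are made up for illustration, and the handling of £ follows my reading of the statement: as input symbol it means "read nothing", as stack top it means "ignore the top", and as action it means "push nothing". The max_steps guard is an extra assumption to stop endless £-moves.

```python
EPS = "£"

def build_table(num_states, transitions):
    """table[i] holds the transitions leaving state i."""
    table = [[] for _ in range(num_states)]
    for src, dst, read, top, push in transitions:
        table[src].append((dst, read, top, push))
    return table

def find_transition(table, state, symbol, stack):
    """Return the first usable transition from `state`, or None."""
    top = stack[-1] if stack else None
    for dst, read, need_top, push in table[state]:
        if read not in (symbol, EPS):
            continue
        if need_top != EPS and need_top != top:
            continue
        return dst, read, need_top, push
    return None

def accepts(table, finals, start, word, max_steps=10_000):
    state, stack, pos = start, ["Z"], 0
    for _ in range(max_steps):
        if pos == len(word) and state in finals:
            return True
        symbol = word[pos] if pos < len(word) else None
        move = find_transition(table, state, symbol, stack)
        if move is None:
            return False
        dst, read, need_top, push = move
        if need_top != EPS:
            stack.pop()                   # the matched top is consumed
        if push != EPS:
            stack.extend(reversed(push))  # leftmost character becomes the new top
        if read != EPS:
            pos += 1
        state = dst
    return False

# First test case from the question.
CASE1 = build_table(3, [
    (0, 0, "1", "Z", "XZ"), (0, 0, "1", "X", "XX"), (0, 1, "0", "X", "X"),
    (1, 1, "1", "X", EPS), (1, 2, EPS, "Z", "Z"),
])
for w in ["111101111", "110111", "011111", "1010101", "11011"]:
    print("Accepted" if accepts(CASE1, {2}, 0, w) else "Rejected")
```

In C++ the table would be a vector of vectors of a small struct and the stack a std::stack or std::vector of char; the control flow stays the same.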

The Art of Computer Programming exercise question: Chapter 1, Question 8

I'm doing the exercises to TAOCP Volume 1 Edition 3 and have trouble understanding the syntax used in the answer to the following exercise.
Chapter 1 Exercise 8
Computing the greatest common divisor of positive integers m & n by specifying Tj, sj, aj, bj
Let your input be represented by the string a^m b^n (m a's followed by n b's)
Answer:
Let A = {a,b,c}, N=5. The algorithm will terminate with the string a^gcd(m,n)
j  Tj       sj       bj  aj
0  ab       (empty)  1   2   Remove one a and one b, or go to 2.
1  (empty)  c        0   0   Add c at extreme left, go back to 0.
2  a        b        2   3   Change all a's to b's
3  c        a        3   4   Change all c's to a's
4  b        b        0   5   if b's remain, repeat
The part that I have trouble understanding is simply how to interpret this table.
Also, when Knuth says this will terminate with the string a^gcd(m,n) -- why the superscript for gcd(m,n)?
Thanks for any help!
Edited with more questions:
What is Tj -- note that T = Theta
What is sj -- note that s = phi
How do you interpret columns bj and aj?
Why does Knuth switch a new notation in the solution to an example that he doesn't explain in the text? Just frustrating. Thanks!!!

Here's an implementation of that exercise answer. Perhaps it helps.
By the way, the table seems to describe a Markov algorithm.
As far as I understand so far, you start with the first command line, j = 0. Replace one occurrence of Tj (the leftmost one) with sj, then jump to the next command line depending on whether you replaced anything: if you did, jump to bj; if nothing was replaced, jump to aj.
EDIT: New answers:
A = {a,b,c} seems to be the character set you can operate with. c comes in during the algorithm (added to the left and later replaced by a's again).
Theta and phi could be Greek characters that are usually used for something like "original" and "replacement", although I don't know that they are.
bj and aj are the table lines to be next executed. This matches with the human-readable descriptions in the last column.
The only thing I can't answer is why Knuth uses this notation without any explanations. I browsed the first chapters and the solutions in the book again and he doesn't mention it anywhere.
EDIT2: Example for gcd(2,2) = 2
Input string: aabb
Line 0: Remove one a and one b, or go to 2.
=> ab => go to 1
Line 1: Add c at extreme left, go back to 0.
=> cab => go to 0
Line 0: Remove one a and one b, or go to 2.
=> c => go to 1
Line 1: Add c at extreme left, go back to 0.
=> cc => go to 0
Line 0: Remove one a and one b, or go to 2.
No ab found, so go to 2
Line 2: Change all a's to b's
No a's found, so go to 3
Line 3: Change all c's to a's
=> aa
Line 4: if b's remain, repeat
No b's found, so go to 5 (end).
=> Answer is "aa" => gcd(2,2) = 2
By the way, I think the description of line 0 should be "Remove one "ab", or go to 2." This makes things a bit clearer.
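The trace above can be reproduced with a few lines of Python. This is a minimal sketch of my reading of the table, not Knuth's own formulation: each rule is (Tj, sj, bj, aj), the leftmost occurrence of Tj is replaced, and the empty Tj of line 1 matches at the extreme left. The names RULES and run are made up, and the input must have m, n >= 1.

```python
from math import gcd

# (Tj, sj, bj, aj) for j = 0..4, copied from the table; "" is the empty string.
RULES = [
    ("ab", "",  1, 2),  # remove one "ab", or go to 2
    ("",   "c", 0, 0),  # add c at extreme left, go back to 0
    ("a",  "b", 2, 3),  # change all a's to b's (one per step)
    ("c",  "a", 3, 4),  # change all c's to a's (one per step)
    ("b",  "b", 0, 5),  # if b's remain, repeat
]
N = 5  # reaching line N terminates the algorithm

def run(sigma):
    j = 0
    while j != N:
        theta, phi, b, a = RULES[j]
        i = sigma.find(theta)       # leftmost occurrence; "" is found at index 0
        if i < 0:
            j = a                   # nothing to replace
        else:
            sigma = sigma[:i] + phi + sigma[i + len(theta):]
            j = b                   # something was replaced
    return sigma

print(run("aabb"))                           # aa
print(run("a" * 4 + "b" * 6) == "a" * gcd(4, 6))
```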

The superscript for gcd(m,n) is due to how numbers are being represented in this table.
For example: m => a^m
n => b^n
gcd(m,n) => a^gcd(m,n)
It looks like Euclid's algorithm is being implemented,
i.e.
gcd(m,n):
    if n==0:
        return m
    return gcd(n,m%n)
The numbers are represented as powers so as to be able to do the modulo operation m%n.
For example, 4 % 3, will be computed as follows:
4 'a's (a^4) mod 3 'b's (b^3), which will leave 1 'a' (a^1).

The notation a^m is probably a notation for the input string in the state-machine context.
It is used to refer to m consecutive instances of a, i.e.:
a^4 = aaaa
b^7 = bbbbbbb
a^4 b^7 a^3 = aaaabbbbbbbaaa
And what a^gcd(m,n) means is that after running the (solution) state machine, the resulting string should be gcd(m,n) instances of a.
In other words, the number of a's in the result should be equal to the result of gcd(m,n).
And I agree with @schnaader that it's probably a table describing a Markov algorithm.
