Constructing list from user input in Prolog - list

I recently encountered the following problem in Prolog: Define a predicate avg_num/0 that interactively prompts the user to input a series of numbers (or the word "stop", without quotation marks, to end the process). The program then outputs the average of these numbers.
In Prolog terminal/interpreter, the program should run like this:
?- avg_num.
?- Enter a number (or stop to end): 2.
?- Enter a number (or stop to end): |: 3.
?- Enter a number (or stop to end): |: 7.
?- Enter a number (or stop to end): |: stop.
The average is 4.
true.
My idea is to store the values in a list and then compute the average of the numbers in that list using the following code.
% finding the sum of numbers of a list
sumlist([],0).
sumlist([H|T],Sum):- sumlist(T,SumTail), Sum is H + SumTail .
% finding the average of the numbers in a list
average(List,A):- sumlist(List,Sum), length(List,Length), A is Sum / Length .
The problem then reduces to: how do I store the numbers from user input in a list? I feel that I need to initialize an empty list at the beginning, but I don't know where to put it. I've tried the following piece of code (it is incomplete, though):
avg_sum:- write('Enter a number (or stop to end): '), process(Ans).
process(Ans):- read(Ans),Ans \= "stop", append(X,[Ans],L).
% this part is still incomplete and incorrect.
It seems that I need to initialize the empty list, but I don't know where.

You're almost there. As you said, you need to keep a list of inputs and then do a final computation once the entire list is known. Such a list (or other object keeping intermediate data) is often called an accumulator.
Here is one way of implementing this. It is based on two auxiliary predicates, one for doing the I/O part and one for doing the computation.
We start the computation with an empty list, as you said.
avg_num :-
    avg_num([], Average),
    write('The average is '), write(Average), writeln('.').
(Recall that in Prolog you can overload predicate names with different arities, i.e., avg_num/0 is a separate predicate from avg_num/2.)
Now avg_num/2 can ask for an input and hand off processing of that input to process/3:
avg_num(Accumulator, Average) :-
    write('Enter a number (or stop to end): '),
    read(Answer),
    process(Answer, Accumulator, Average).
Only process/3 consumes the accumulator: if the input was stop, it computes the average of the accumulated numbers; otherwise it adds the answer to the accumulator and continues reading.
process(stop, Accumulator, Average) :-
    average(Accumulator, Average).
process(Answer, Accumulator, Average) :-
    dif(Answer, stop),
    avg_num([Answer | Accumulator], Average).
Note that the order of the numbers in the list does not matter for computing the average, so we can add the input at the front of the list, which is easier than appending at the end.
The mutual recursion between the avg_num/2 and process/3 predicates is not necessarily easy to understand, especially because the names are not very well chosen.
In practice, I would get rid of process/3 by simplifying avg_num/2 like this:
avg_num(Accumulator, Average) :-
    write('Enter a number (or stop to end): '),
    read(Answer),
    (   Answer = stop
    ->  average(Accumulator, Average)
    ;   avg_num([Answer | Accumulator], Average)
    ).
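If the control flow is easier to follow in an imperative language, here is the same accumulator idea as a rough sketch in Python rather than Prolog. The function name average_until_stop and the use of a list of strings in place of interactive input are my own assumptions, made so the example can run without a terminal; returning None for an empty input is also my choice, not something the Prolog code above does.

```python
def average_until_stop(tokens):
    """Collect numbers from `tokens` until the word 'stop', then average them."""
    accumulator = []  # the empty list the computation starts with
    for token in tokens:
        if token == "stop":
            break
        # Prepend the new number, mirroring [Answer | Accumulator].
        accumulator.insert(0, float(token))
    if not accumulator:
        return None  # assumption: no numbers entered means no average
    return sum(accumulator) / len(accumulator)


print(average_until_stop(["2", "3", "7", "stop"]))  # 4.0
```

As in the Prolog version, the list only exists to be consumed once "stop" arrives, so the order in which the numbers are stored does not matter.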

I would separate out a predicate that reads the list, then handle the list afterwards.
Here's one way to read in the list until an ending term, End:
% read_list_until(L, End)
% Reads terms from the user into L until the term End is read.
read_list_until(L, End) :-
    (   read_element(E, End)
    ->  L = [E|L1],
        read_list_until(L1, End)
    ;   L = []
    ).
read_element(E, End) :-
    read(E),
    dif(E, End).
This has the following behavior:
2 ?- read_list_until(L, stop).
|: 1.
|: 2.
|: 3.
|: stop.
L = [1, 2, 3].
3 ?-
Then you can just use standard Prolog predicates to take the average:
avg_list(L, Avg) :-
    read_list_until(L, stop),
    sum_list(L, Sum),
    length(L, Num),
    Avg is Sum / Num.

Related

List in Prolog with elements in round brackets

Good Morning
I have a list similar to this: [(1-4), (2-4), (3-4)]. I'd like to write only the first/second/third part of each round bracket. I wrote a function:
write_list([]).
write_list([Head|Tail]) :-
    write(Head), nl,
    write_list(Tail).
It only writes the whole round bracket:
1-4
2-4
3-4
I'd like my output to be the 1st element of round bracket:
1
2
3
I'll be grateful for any help :D
Here you are:
write_list([]).
write_list([(A-_)|Tail]) :-
    writeln(A),
    write_list(Tail).
Query:
?- write_list([(1-4),(2-4),(3-4)]).
1
2
3
true
writeln/1 is simply write/1 followed by nl.
You don't really want to write the results but provide them as an argument. Many beginners in Prolog get stuck on this point. Also, it's such a common pattern to apply the same logic to each list element that Prolog has a predicate called maplist for doing the work for you:
first_subterm(A-_, A). % First subterm of `A-_` is `A`
first_subterms(PairList, FirstSubTerms) :-
    maplist(first_subterm, PairList, FirstSubTerms).
And you would call it like so:
| ?- first_subterms([(1-4), (2-4), (3-4)], FirstSubTerms).
FirstSubTerms = [1,2,3]
yes
| ?-
The long-hand recursive form would be similar to what was given in the other answer:
first_subterms([], []). % The list of first subterms of [] is []
first_subterms([(A-_)|Pairs], [A|SubTerms]) :-
    first_subterms(Pairs, SubTerms).
Note that the "round brackets" are parentheses and, in this context, Prolog uses them only to group the term. The list [(1-4), (2-4), (3-4)] therefore behaves the same as [1-4, 2-4, 3-4], since in list notation the , binds less tightly than -. So this query gives the same result:
| ?- first_subterms([1-4, 2-4, 3-4], FirstSubTerms).
FirstSubTerms = [1,2,3]
yes
| ?-

Pyspark-length of an element and how to use it later

So I have a dataset of words, I try to keep only those that are longer than 6 characters:
data=dataset.map(lambda word: word,len(word)).filter(len(word)>=6)
When:
print data.take(10)
it returns all of the words, including the first 3, which have a length lower than 6. I don't actually want to print them, but to continue working on the data that have a length greater than 6.
So when I will have the appropriate dataset, I would like to be able to select the data that I need, for example the ones that have length less than 15 and be able to make computations on them.
Or even to apply a function on the "word".
Any ideas??
What you want is something along this (untested):
data = dataset.map(lambda word: (word, len(word))).filter(lambda t: t[1] >= 6)
In the map, you return a tuple (word, length of word). The filter then looks at the length, which is the second element of the tuple (t[1]), and keeps only the tuples whose length is greater than or equal to 6.
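To see what the two steps do without a Spark cluster, here is a small sketch that uses plain Python lists as a stand-in for the dataset. The sample words and the variable names are made up for illustration; with a real RDD you would pass the same kind of lambdas to its map and filter methods instead of using list comprehensions.

```python
# Hypothetical sample data standing in for the word dataset.
words = ["cat", "jackson", "ville", "jacksonville", "python", "interpreter"]

# Pair each word with its length, as the map step does.
pairs = [(word, len(word)) for word in words]

# Keep the pairs whose length (second element) is at least 6, as the filter step does.
long_pairs = [t for t in pairs if t[1] >= 6]

# Continue working on the kept data: select lengths below 15
# and apply a function to the word itself.
shouted = [t[0].upper() for t in long_pairs if t[1] < 15]

print(long_pairs)
print(shouted)
```

Because each element keeps both the word and its length, later steps can select on the length (t[1]) or transform the word (t[0]) without recomputing anything.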

Every other letter

So, I have tried this problem for what seems like a hundred times this week alone.
It's filling in the blank for the following program...
You entered jackson and ville.
When these are combined, it makes jacksonville.
Taking every other letter gives us jcsnil.
The blanks I have filled are fine, but the rest of the blanks, I can't figure out. Here they are.
x = raw_input("Enter a word: ")
y = raw_input("Enter another word: ")
print("You entered %s and %s." % (x,y))
combined = x + y
print("When these are combined, it makes %s." % combined)
every_other = ""
counter = 0
for __________________ :
    if ___________________ :
        every_other = every_other + letter
    ____________
print("Taking every other letter gives us %s." % every_other)
I just need three blanks for this program. This is basic Python, so nothing too complicated, just something I can match with the twenty options. Please, I appreciate your help!
The first blank needs to define letter so that each time through the loop it is the letter at position counter in combined
The second blank needs to test for the current letter's position being one that gets included
The last blank needs to modify counter for the next value of letter (much as the initial value of counter was for the first letter).
The solution is to slice with a step value.
In [10]: "jacksonville"[::2]
Out[10]: 'jcsnil'
The slice notation means "take the subset starting at the beginning of the iterable, ending at the end of the iterable, selecting every second element". Remember that Python slices start by selecting the first element available in the slice.
EDIT: Didn't realize it had to fill in the blanks
for letter in combined:
    if counter % 2 == 0:
        every_other = every_other + letter
    counter += 1
Taking every other letter means taking a letter on every second pass through the loop. Since counter tracks how many passes you have made, you can use the modulo operator (%) to decide when to take a letter. The base case is 0 % 2 == 0, which lets you take the first letter. It's important to remember to always increment the counter.
A way to do this without the manual counter, which was already mentioned in the comments, is to use the enumerate function on combined. When given an iterable, enumerate returns an iterator that yields a pair on each step: the position in the iterable and the value at that position.
I'm using this language, talking about iterables as index-able sequences, but it could be any generator-like object which doesn't have to have a finite, pre-defined sequence.
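Here is a minimal sketch of the enumerate variant. The function name every_other_letter is just something I picked for the example; the loop body is the same as in the fill-in-the-blank version, with position taking over the role of counter.

```python
def every_other_letter(combined):
    """Return the letters at even positions (0, 2, 4, ...) of `combined`."""
    every_other = ""
    # enumerate yields (position, letter) pairs, so no manual counter is needed.
    for position, letter in enumerate(combined):
        if position % 2 == 0:
            every_other = every_other + letter
    return every_other


print(every_other_letter("jackson" + "ville"))  # jcsnil
```

The result matches the slice "jacksonville"[::2], which is a handy way to check the loop.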

Haskell range notation to generate list. Unexpected output

I came across an exercise in one of my lectures that left me confused about the output of [2, 2 .. 2]. Why does entering [2, 2 .. 2] generate an "infinite" list of 2's?
The way I understood the notation was that the first element is the start bound, the second is the "gap" between the numbers, and the last is the end of the list; in other words, stop when reaching that number.
If my reasoning is right, why does the expression [2, 2 .. 2] not output [2]?.
I thought Haskell might evaluate it this way:
When printing the first element of the list, it is equal to the last, therefore stop.
Or, if the first element is not checked against the "outer" bound, then the output would be [2, 2], because when adding zero to the previous number (in our case the start bound 2) we would have reached the end and therefore stop.
I obviously do not understand the workings of the notation correctly, so how does Haskell evaluate the expression?
The notation is meant to mimic the usual way of writing simple sequences in mathematics. The second element is not the step, but in fact the actual second element of the list. The rest is linearly extrapolated from there. Examples:
[1,2..10] = [1,2,3,4,5,6,7,8,9,10]
[1,3..10] = [1,3,5,7,9]
[4,3..0] = [4,3,2,1,0]
[0,5..] = [0,5,10,15,20,25,30,35... -- infinite
[1,1..] = [1,1,1,1,1,1,1,1,1,1,1... -- infinite
The reason [2,2..2] is infinite is that no value of the list is ever greater than the right endpoint, and exceeding the endpoint is the terminating condition. If you want the step to be 2, you should write [2,4..2], which gives the expected output [2]. (So I guess it's not the actual second element of the list in all cases, but you see the logic.)
Basically because the standard specifies so. [e1, e2 .. e3] desugars to enumFromThenTo e1 e2 e3.
In section 6.3.4, the Haskell 98 Report says:
The sequence enumFromThenTo e1 e2 e3 is the list [e1,e1+i,e1+2i,...e3], where the increment, i, is e2-e1. If the increment is positive or zero, the list terminates when the next element would be greater than e3; the list is empty if e1 > e3. If the increment is negative, the list terminates when the next element would be less than e3; the list is empty if e1 < e3.
For [2, 2 .. 2] the increment is zero and the next element is never greater than 2, so the list never terminates.
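To make the quoted rule concrete, here is a rough model of it in Python (not Haskell, and only for integers). The name enum_from_then_to is my own; it is a generator because, as with the zero increment, the result can be infinite.

```python
from itertools import islice


def enum_from_then_to(e1, e2, e3):
    """Model of the quoted enumFromThenTo rule for integers."""
    i = e2 - e1  # the increment is the difference of the first two elements
    x = e1
    if i >= 0:
        # Positive or zero increment: stop once the next element would exceed e3.
        while x <= e3:
            yield x
            x += i
    else:
        # Negative increment: stop once the next element would drop below e3.
        while x >= e3:
            yield x
            x += i


print(list(enum_from_then_to(1, 3, 10)))           # [1, 3, 5, 7, 9]
print(list(enum_from_then_to(2, 4, 2)))            # [2]
print(list(islice(enum_from_then_to(2, 2, 2), 5)))  # [2, 2, 2, 2, 2]
```

The last call has to be cut off with islice: with an increment of zero the loop condition x <= e3 stays true forever, which is exactly the behaviour asked about.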

A confusion about the porter stemming algorithm

I am trying to implement porter stemming algorithm, but I stumbled at this point
where the square brackets denote
arbitrary presence of their contents.
Using (VC){m} to denote VC repeated m
times, this may again be written as
[C](VC){m}[V].
m will be called the \measure\ of any
word or word part when represented in
this form. The case m = 0 covers the
null word. Here are some examples:
m=0 TR, EE, TREE, Y, BY.
m=1 TROUBLE, OATS, TREES, IVY.
m=2 TROUBLES, PRIVATE, OATEN, ORRERY.
I don't understand what this "measure" is and what it stands for.
It looks like the measure is the number of times a group of vowels is immediately followed by a group of consonants. For example,
"TROUBLES" has:
Optional initial consonants [C] = "TR".
First vowels-consonants group (VC) = "OUBL".
Second vowels-consonants group (VC) = "ES".
Optional ending vowels [V] is empty.
So the measure is two, the number of times (VC) was "matched".
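If it helps, here is a small Python sketch that computes the measure this way and reproduces the examples from the quoted text. The function names are mine. One assumption that is not in the excerpt above: Porter's definition treats Y as a vowel when it follows a consonant and as a consonant otherwise, which is what makes BY come out as m=0 and IVY as m=1.

```python
def is_consonant(word, i):
    """True if word[i] is a consonant under Porter's definition (assumed, see above)."""
    ch = word[i]
    if ch in "aeiou":
        return False
    if ch == "y":
        # y is a consonant at the start of the word or after a vowel,
        # and a vowel when it follows a consonant.
        return i == 0 or not is_consonant(word, i - 1)
    return True


def measure(word):
    """Count how many times a vowel group is followed by a consonant group."""
    word = word.lower()
    pattern = "".join(
        "C" if is_consonant(word, i) else "V" for i in range(len(word))
    )
    # Each switch from a vowel run to a consonant run shows up as one "VC".
    return pattern.count("VC")


for w in ["TREE", "BY", "TROUBLE", "IVY", "TROUBLES", "ORRERY"]:
    print(w, measure(w))
```

For TROUBLES the letter pattern is CCVVCCVC, which contains "VC" twice, matching the two (VC) groups "OUBL" and "ES" above.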