Haskell manipulating elements of a list

Let's say I have a string and I want to check each of its elements to see whether it is a number (digit) or a character. Each number has to be replaced with the number 1 and each character with the number 2, and the final result is the sum of all these numbers.
Example: function "123abc" has to give a result of 9.
I've already come up with a solution using list comprehensions and pattern matching, but I need a solution that does not use them, meaning only elem, head, tail, reverse, map, sum etc. It all has to be one function, not a few combined as one.

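You can do it as follows: map each character to 1 if it is a digit and to 2 otherwise (bool 2 1 yields 1 when isDigit returns True), then sum the results.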
import Data.Char (isDigit)
import Data.Bool (bool)
getsum :: String -> Int
getsum = sum . map (bool 2 1 . isDigit)
*Main> getsum "1234abc"
10

Related

Match list of incrementing integers using regex

Is it possible to match a list of comma-separated decimal integers, where the integers in the list always increment by one?
These should match:
0,1,2,3
8,9,10,11
1999,2000,2001
99,100,101
These should not match (in their entirety - the last two have matching subsequences):
42
3,2,1
1,2,4
10,11,13
Yes, this is possible when using a regex engine that supports backreferences and conditionals.
First, the list of consecutive numbers can be decomposed into a list where each pair of adjacent numbers is consecutive:
(?=(?&cons))\d+
(?:,(?=(?&cons))\d+)*
,\d+
Here (?=(?&cons)) is a placeholder for a predicate that ensures that two numbers are consecutive. This predicate might look as follows:
(?<cons>\b(?:
(?<x>\d*)
(?:(?<a0>0)|(?<a1>1)|(?<a2>2)|(?<a3>3)|(?<a4>4)
|(?<a5>5)|(?<a6>6)|(?<a7>7)|(?<a8>8))
(?:9(?= 9*,\g{x}\d (?<y>\g{y}?+ 0)))*
,\g{x}
(?(a0)1)(?(a1)2)(?(a2)3)(?(a3)4)(?(a4)5)
(?(a5)6)(?(a6)7)(?(a7)8)(?(a8)9)
(?(y)\g{y})
# handle the 999 => 1000 case separately
| (?:9(?= 9*,1 (?<z>\g{z}?+ 0)))+
,1\g{z}
)\b)
For a brief explanation, the second case handling 999,1000 type pairs is easier to understand -- there is a very detailed description of how it works in this answer concerned with matching a^n b^n. The connection between the two is that in this case we need to match 9^n ,1 0^n.
The first case is slightly more complicated. The largest part of it handles the simple case of incrementing a decimal digit, which is relatively verbose due to the number of said digits:
(?:(?<a0>0)|(?<a1>1)|(?<a2>2)|(?<a3>3)|(?<a4>4)
|(?<a5>5)|(?<a6>6)|(?<a7>7)|(?<a8>8))
(?(a0)1)(?(a1)2)(?(a2)3)(?(a3)4)(?(a4)5)
(?(a5)6)(?(a6)7)(?(a7)8)(?(a8)9)
The first block captures whether the digit is N into group aN, and the second block then uses conditionals to check which of these groups was used. If group aN is non-empty, the next digit should be N+1.
The remainder of the first case handles cases like 1999,2000. This again falls into the pattern N 9^n, N+1 0^n, so this is a combination of the method for matching a^n b^n and incrementing a decimal digit. The simple case of 1,2 is handled as the limiting case where n=0.
Complete regex: https://regex101.com/r/zG4zV0/1
Alternatively the (?&cons) predicate can be implemented slightly more directly if recursive subpattern references are supported:
(?<cons>\b(?:
(?<x>\d*)
(?:(?<a0>0)|(?<a1>1)|(?<a2>2)|(?<a3>3)|(?<a4>4)
|(?<a5>5)|(?<a6>6)|(?<a7>7)|(?<a8>8))
(?<y>
,\g{x}
(?(a0)1)(?(a1)2)(?(a2)3)(?(a3)4)(?(a4)5)
(?(a5)6)(?(a6)7)(?(a7)8)(?(a8)9)
| 9 (?&y) 0
)
# handle the 999 => 1000 case separately
| (?<z> 9,10 | 9(?&z)0 )
)\b)
In this case the two grammars 9^n ,1 0^n, n>=1 and prefix N 9^n , prefix N+1 0^n, n>=0 are pretty much just written out explicitly.
Complete alternative regex: https://regex101.com/r/zG4zV0/3
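When testing either regex against the examples from the question, it can help to have a plain, non-regex reference check to compare against. The following is a minimal Python sketch of that idea, not part of the original answer; the function name is_incrementing_list is hypothetical, and it assumes (as in the examples) that a single number such as 42 should not match.

```python
import re

def is_incrementing_list(text):
    """Reference check: comma-separated decimal integers, each one larger than the last by 1."""
    # Require at least two comma-separated runs of ASCII digits.
    if not re.fullmatch(r"[0-9]+(,[0-9]+)+", text):
        return False
    nums = [int(part) for part in text.split(",")]
    # Every adjacent pair must be consecutive, mirroring the pairwise decomposition.
    return all(b == a + 1 for a, b in zip(nums, nums[1:]))

should_match = ["0,1,2,3", "8,9,10,11", "1999,2000,2001", "99,100,101"]
should_not_match = ["42", "3,2,1", "1,2,4", "10,11,13"]
results = [is_incrementing_list(s) for s in should_match + should_not_match]
```

Comparing these results with what the regex reports on the same strings is a quick way to spot a mistake in the pattern.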

Is it possible for a regex to identify all equal value subarrays?

An equal value subarray is a subarray containing one or more consecutive elements of the same value.
For example, let's say our array is:
1,1,3
There are four equal value sub-arrays:
[1], [1], [3], [1,1]
Note that elements can be part of more than one subarray.
I know [\d] matches digits, but I can't work out this requirement. I am asking for a regex solution out of curiosity.
There's no way to do this with one regex. In fact, I recommend that you use more than one version of the string.
This regex should work:
^(\d+)(,\1){n}
I've made some adjustments to ensure a more robust regex:
Allows for multi-digit numbers (10 and above)
Will only match at the start, ensuring the count is not thrown off
For an array of length 4, you should replace n with 0, 1, 2, 3. This means that you will have to match against four regexes.
(Note that n=0 is the same as ^(\d+))
Furthermore, you will have to "behead" the string, meaning that you would first match against 1,1,1,3 (new example) and then 1,1,3, and then 1,3, and then 3.
Fun fact: you can use a regex to behead the string (group 1 will have the beheaded string):
^\d+,(.*)
(Obviously, you will need to ensure that you're not trying to behead an array of size 1.)
For an array of size 4, you will need to match against 4+3+2+1=10 regexes. You should test to see if the regex matched; if it did, you know to increment your count by 1. (Note that 10 is the maximum number of consecutive combinations for an array of 4.)
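The procedure above (try every n on the current string, then behead and repeat) can be sketched in Python roughly as follows. This is only an illustration under some assumptions: the function name count_equal_subarrays is hypothetical, and the trailing (?!\d) lookahead is an addition that is not in the regex above; it stops a number from matching the prefix of a longer one (for example 1 against 11).

```python
import re

def count_equal_subarrays(csv):
    """Count equal-value subarrays of a comma-separated list of integers."""
    count = 0
    rest = csv
    while rest:
        length = rest.count(",") + 1
        # One regex per possible number of repetitions, anchored at the start.
        for n in range(length):
            pattern = r"^(\d+)(,\1){%d}(?!\d)" % n
            if re.match(pattern, rest):
                count += 1
        # Behead the string; stop when only one element is left.
        beheaded = re.match(r"^\d+,(.*)", rest)
        rest = beheaded.group(1) if beheaded else ""
    return count

example = count_equal_subarrays("1,1,3")
```

For the array 1,1,3 from the question this counts the four subarrays [1], [1], [3] and [1,1].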
Here's an explanation of why you need to use more than one string. Take this regex:
(\d)(,?\1){n}
Again, n needs to be replaced. You would also need to use the g modifier (or its equivalent).
I'll use your example of 1,1,1,1:
n=0 gives 4 matches
n=1 gives 2 matches
n=2 gives 1 match
n=3 gives 1 match
As you can see, it does not handle overlapping matches very well, because that's not how regex was designed.
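The match counts listed above can be reproduced with a short Python sketch, where re.finditer plays the role of the g modifier. The helper name count_matches is hypothetical.

```python
import re

def count_matches(text, n):
    """Count non-overlapping matches of (\\d)(,?\\1){n} in text."""
    pattern = r"(\d)(,?\1){%d}" % n
    return sum(1 for _ in re.finditer(pattern, text))

# One count per value of n = 0, 1, 2, 3 for the string 1,1,1,1.
counts = [count_matches("1,1,1,1", n) for n in range(4)]
```

Each match consumes the characters it covers, so a later match can never start inside an earlier one; this is why n=1 finds only 2 of the 3 adjacent pairs.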

Split list of lists (integers) by consecutive order into separate lists

I have
List1 = [[...11,12,13,14,7,8,9],[0,1,2,3]]
where ... stands for consecutive integers starting from 0 up to 11
I want
List2 = [[11,12,13,14],[7,8,9],[0,1,2,3]]
EDIT:
Found the answer:
from Python: split list of integers based on step between them
[list(g) for k, g in groupby(listName, key=lambda i,j=count(): i-next(j))]
Turns out, that wasn't what I was looking for. I need to be able to split the list of lists of integers by consecutive order ONLY if the next integer in that list was of lesser value than the preceding integer.
e.g.
[[0,1,2,15,16,17,2,3,4,6,8,9]]
should be split into
[[0,1,2,15,16,17],[2,3,4,6,8,9]]
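One possible way to express this requirement is a single pass that starts a new sublist whenever a value is smaller than the one before it. This is a minimal sketch, assuming that equal neighbouring values stay in the same sublist; the names split_on_decrease and split_all are hypothetical.

```python
def split_on_decrease(values):
    """Split one list wherever an element is smaller than its predecessor."""
    runs = []
    for v in values:
        if runs and v >= runs[-1][-1]:
            runs[-1].append(v)   # still non-decreasing: extend the current run
        else:
            runs.append([v])     # first element, or a drop: start a new run
    return runs

def split_all(list_of_lists):
    """Apply split_on_decrease to every inner list and flatten one level."""
    return [run for inner in list_of_lists for run in split_on_decrease(inner)]

result = split_all([[0, 1, 2, 15, 16, 17, 2, 3, 4, 6, 8, 9]])
```

Unlike the groupby approach, this compares each element only with its predecessor, so gaps such as 2 to 15 or 4 to 6 do not cause a split.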

Pyspark-length of an element and how to use it later

So I have a dataset of words, I try to keep only those that are longer than 6 characters:
data=dataset.map(lambda word: word,len(word)).filter(len(word)>=6)
When:
print data.take(10)
it returns all of the words, including the first 3, which are shorter than 6 characters. I don't actually want to print them, but to continue working on the data whose length is at least 6.
So when I will have the appropriate dataset, I would like to be able to select the data that I need, for example the ones that have length less than 15 and be able to make computations on them.
Or even to apply a function on the "word".
Any ideas??
What you want is something along these lines (untested):
data=dataset.map(lambda word: (word,len(word))).filter(lambda t : t[1] >=6)
In the map, you return a tuple (w, l) of the word and its length. The filter then looks at the length (t[1], the l) and keeps only the tuples whose l is greater than or equal to 6.
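The same shape can be tried out without a Spark cluster. The sketch below uses Python's built-in map and filter as stand-ins for the RDD methods, which is an assumption made purely so that it runs anywhere; the word list is made up for illustration.

```python
words = ["spark", "dataset", "lambda", "map", "transformation", "parallelization"]

# Pair each word with its length, then filter on the stored length.
pairs = map(lambda word: (word, len(word)), words)
long_pairs = list(filter(lambda t: t[1] >= 6, pairs))

# Later steps can reuse the length, e.g. keep only words shorter than 15 ...
shorter_than_15 = [w for (w, l) in long_pairs if l < 15]
# ... or apply a function to the word itself.
upper = [w.upper() for w in shorter_than_15]
```

Keeping the length in the tuple means later selections, such as lengths below 15, do not need to recompute it.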

simulate a deterministic pushdown automaton (PDA) in c++

I was reading a UVa exercise in which I need to simulate a deterministic pushdown automaton (PDA), to see
whether certain strings are accepted by the PDA, given input in the following format:
The first line of input contains an integer C, the number of test cases. The first line of each test case contains five integers E, T, F, S and C, where E is the number of states in the automaton, T the number of transitions, F the number of final states, S the initial state and C the number of test strings. The next line contains F integers, the final states of the automaton. Then come T lines, each with two integers I and J and three strings L, T and A, where I and J (0 ≤ I, J < E) are the origin and destination states of a transition. L is the character read from the tape for the transition, T is the symbol found at the top of the stack and A is the action to perform on the top of the stack at the end of the transition. The character that represents the bottom of the stack is always Z. The character £ (<alt+156>) is used to represent the end of the string, the action of popping the stack, or that the top of the stack is not taken into account for the transition. The stack alphabet consists of capital letters. For the string A, the symbols are pushed from right to left (the same way as in the program JFlap, i.e. the new top of the stack will be the leftmost character). Then come C lines, each with an input string. The input strings may contain lowercase letters and digits (not necessarily present in any transition).
The first line of output for each test case must display the string "Case G:", where G is the number of the test case (starting at 1). Then follow C lines, each printing the word "OK" if the automaton accepts the string or "Reject" otherwise.
For example:
Input:
2
3 5 1 0 5
2
0 0 1 Z XZ
0 0 1 X XX
0 1 0 X X
1 1 1 X £
1 2 £ Z Z
111101111
110111
011111
1010101
11011
4 6 1 0 5
3
1 2 b A £
0 0 a Z AZ
0 1 a A AAA
1 0 a A AA
2 3 £ Z Z
2 2 b A £
aabbb
aaaabbbbbb
c1bbb
abbb
aaaaaabbbbbbbbb
this is the output:
Output:
Case 1:
Accepted
Rejected
Rejected
Rejected
Accepted
Case 2:
Accepted
Accepted
Rejected
Rejected
Accepted
I need some help or ideas on how to simulate this PDA. I am not asking for code that solves the problem, because I want to write my own (the idea is to learn, right?), but I need some help (an idea or pseudocode) to begin the implementation.
You first need a data structure to hold the transitions. You can use a vector of transition structs, each containing one transition quintuple. Alternatively, you can use the fact that states are integers and create a vector that keeps the transitions from state 0 at index 0, the transitions from state 1 at index 1, and so on. This reduces the time spent searching for the correct transition.
You can use the stack from the STL for the stack. You also need a search function, which may change depending on your implementation. If you use the first method, you can use a function like:
int findIndex(vector<quintuple> v)//which finds the index of correct transition otherwise returns -1
then use the return value to get the new state and the new stack symbol.
Or you can use a for loop over the vector and a bool flag that records whether a transition was found.
With the second method you can use a function that takes references to the new state and new stack symbol and sets them if it finds an appropriate transition.
For inputs you can use something like vector or vector, depending on personal taste. You can implement your main method with for loops, but if you want extra difficulty you can implement a recursive function. May it be easy.
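To make the second method concrete (transitions indexed by their origin state), here is a small sketch in Python rather than C++, intended as pseudocode-level guidance and not a full solution. It rests on several assumptions: acceptance is by final state once the input is consumed, £ as the read symbol means a move that consumes no input, £ as the action means nothing is pushed, and a step limit guards against loops. The names build_table and accepts are hypothetical, and input parsing is left out.

```python
EPS = "£"  # marker the problem statement uses for "nothing"

def build_table(num_states, transitions):
    """table[i] maps (read symbol, stack top) -> (next state, symbols to push)."""
    table = [dict() for _ in range(num_states)]
    for src, dst, symbol, top, push in transitions:
        table[src][(symbol, top)] = (dst, push)
    return table

def accepts(table, start, finals, word, max_steps=10000):
    state, stack, pos = start, ["Z"], 0
    for _ in range(max_steps):
        if pos == len(word) and state in finals:
            return True
        if not stack:
            return False
        top = stack[-1]
        # Prefer a move that reads the next character, else try a £ move.
        move = table[state].get((word[pos], top)) if pos < len(word) else None
        if move is not None:
            pos += 1
        else:
            move = table[state].get((EPS, top))
            if move is None:
                return False
        state, push = move
        stack.pop()
        if push != EPS:
            stack.extend(reversed(push))  # leftmost symbol becomes the new top
    return False

# First automaton from the sample input.
case1 = build_table(3, [(0, 0, "1", "Z", "XZ"), (0, 0, "1", "X", "XX"),
                        (0, 1, "0", "X", "X"), (1, 1, "1", "X", EPS),
                        (1, 2, EPS, "Z", "Z")])
verdicts = [accepts(case1, 0, {2}, w)
            for w in ["111101111", "110111", "011111", "1010101", "11011"]]
```

Because the table is indexed by state and keyed on (symbol, top), finding the next transition is a single lookup instead of a scan over all quintuples.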