Comparing the contents of two lists in Prolog

I have a homework assignment and I am stuck at one point. I am given some facts like these:
word([h,e,l,l,o]).
word([m,a,n]).
word([w,o,m,a,n]). etc
and I have to write a rule so that the user inputs a list of letters, and I compare that list with the words I have and correct any possible mistakes. Here is the code I am using for the case where the first letter is in the correct place:
mistake_letter([],[]).
mistake_letter([X|L1],[X|L2]):-
word([X|_]),
mistake_letter(L1,L2).
The problem is that I don't know how to move to the next letter in the word fact. On the next recursive call it will again use the head of the word, while I would like to use the second letter in the list. Any ideas on how to solve this?
I am sorry for any grammatical mistakes and I appreciate your help.

In order to move to the next letter in the word fact, you need to make the word from the fact a third argument, and take it along for the ride. In your mistake_letter/2, you will pick words one by one, and call mistake_letter/3, passing the word you picked along, like this:
mistake_letter(L1,L2):-
word(W),
mistake_letter(L1,L2,W).
Then you'll need to change your base case to decide what happens when the letters in the word being corrected run out before the letters of the word that you picked. What you do depends on your assignment: you could fail and backtrack unless both run out together, mistake_letter([],[],[]).; declare a match, mistake_letter([],[],_).; attach the word's tail to the correction, mistake_letter([],W,W).; or do something else.
You also need an easy case to cover the situation when the first letter of the word being corrected matches the first letter of the word that you picked:
mistake_letter([X|L1],[X|L2],[X|WT]):-
mistake_letter(L1, L2, WT).
Finally, you need the most important case: what to do when the initial letters do not match. This is probably the bulk of your assignment: the rest is just boilerplate recursion. In order to get it right, you may need to change mistake_letter/3 to mistake_letter/4 to be able to calculate the number of matches, and later compare it to the number of letters in the original word. This would let you drop "corrections" like [w,o,r,l,d] --> [h,e,l,l,o] as having only 20% of matching letters.
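The match-counting idea in the last paragraph can be illustrated outside Prolog. The following is only a rough Python sketch of the counting and filtering logic, not the assignment's solution: the names match_ratio and correct, the 0.5 threshold, and the rule that only words of equal length are compared are all assumptions made for the illustration.

```python
WORDS = ["hello", "man", "woman"]  # mirrors the word/1 facts from the question


def match_ratio(typed, candidate):
    """Fraction of positions where the two words have the same letter.

    Assumption: words of different lengths are treated as not matching at all.
    """
    if len(typed) != len(candidate):
        return 0.0
    if not candidate:
        return 1.0
    same = sum(1 for a, b in zip(typed, candidate) if a == b)
    return same / len(candidate)


def correct(typed, words=WORDS, threshold=0.5):
    """Return the best-matching known word, or None if nothing is close enough."""
    best = max(words, key=lambda w: match_ratio(typed, w), default=None)
    if best is not None and match_ratio(typed, best) >= threshold:
        return best
    return None


print(match_ratio("world", "hello"))  # 0.2, i.e. the 20% case mentioned above
print(correct("hallo"))               # hello
print(correct("world"))               # None
```

In Prolog the same count would be carried in the extra argument of mistake_letter/4 and compared with the word length once the recursion finishes.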

Related

FLUTTER - Checking if a string contains another one

I am working on an English vocabulary learning app. Some of the exercises given to the users are written quizzes. They have to translate French words into English words and vice versa.
To make the checking a little more sophisticated than just "1" or "0" (TypedWord == expectedWord), I have been working with similarities between strings and that worked well (for spelling mistakes for example).
I had also used the contains function, so that for example, if the user adds an article in front of the expected word, it doesn't consider it wrong. (Ex : Ecole (School is expected), but user writes "A school").
So I was checking with lines such as "if (typedWord.contains(word)==true) then...". It works fine for the article problem.
But it prompts another issue:
Ex: A bough --> the expected French word is "branche". If the user types "une branche", it considers it correct, which is great. But if the user types "débrancher" (to unplug), it considers it correct as well, since the word "branche" is a part of "débrancher".
How could I keep this from happening? Any ideas for other ways to go about it?
I read the three proposed answers, which are really interesting. The thing is that some of the words are compound (e.g. "kitchen appliance", "garden tool"), so I think the approaches that split on spaces might be problematic.
In this case, split the whole answer on spaces, then compare each piece with the correct word.
For example:
User's answer: That is my school
Split it on spaces, so that you get an array of words:
that, is, my, school.
Then compare each word with your word. That will tell you whether the answer is correct.
The Flutter code will look like this:
usersAnswer?.split(" ").forEach((word) {
  if (word == correctAnswer) {
    print("this is a correct answer");
  }
});
You can split the string by space and check if the resulting array has the word you're looking for.
typedWord.split(' ').contains('branche');
So if typedWord is 'une branche', the split(' ') will turn it into this array: ['une', 'branche'].
Now when you check if this array contains('branche') it will check if the exact string branche exists which in this case it does and returns true.
However, if it's 'une debranche', the resulting array would be ['une', 'debranche'], and because this array has no value equal to 'branche' it will return false. Remember that split turns the string into an array, and contains on an array checks whether an item exactly equal to the value you provide exists, whereas contains on a string checks whether any part of that string matches the given value.
You could check for whitespaces before and after the correct word: something like if (typedWord.contains(' '+word+' ')==true) then..., so that "débrancher" gets marked as wrong. This is kind of strict, though: if the sentence must be completed with some punctuation, it would be rejected by this check. You'll probably want some RegExp that allows punctuation but not whitespaces.
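To make the "RegExp that allows punctuation" idea concrete, here is a small sketch. It is written in Python rather than Dart purely for illustration, and the function name contains_word is an assumption. It accepts the expected word or phrase only when it is not glued to another letter on either side, which should also cover compound entries such as "kitchen appliance".

```python
import re


def contains_word(typed: str, expected: str) -> bool:
    """True if `expected` occurs in `typed` as a whole word or phrase."""
    # (?<!\w) and (?!\w) reject a letter, digit, or underscore directly
    # before or after the match; spaces, punctuation, and the start or end
    # of the string are all accepted.
    pattern = r"(?<!\w)" + re.escape(expected) + r"(?!\w)"
    return re.search(pattern, typed, flags=re.IGNORECASE) is not None


print(contains_word("une branche", "branche"))                     # True
print(contains_word("débrancher", "branche"))                      # False
print(contains_word("A kitchen appliance.", "kitchen appliance"))  # True
```

Dart's RegExp also supports lookbehind and lookahead, so the same pattern should carry over, but that port is not shown or tested here.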

How to Solve this Modified Word Ladder Problem?

Here is the word ladder problem:
Given two words (beginWord and endWord), and a dictionary's word list, find the length of the shortest transformation sequence from beginWord to endWord, such that:
Only one letter can be changed at a time.
Each transformed word must exist in the word list. Note that beginWord is not a transformed word.
Now, in addition to changing a letter, we are allowed to delete or add a letter.
We have to find the minimum number of steps, if possible, to convert string1 to string2.
This problem has a nice BFS structure. Let's illustrate this using the example in the problem statement.
beginWord = "hit",
endWord = "cog",
wordList = ["hot","dot","dog","lot","log","cog"]
Since only one letter can be changed at a time, if we start from "hit", we can only move to those words which differ from it in exactly one letter (in this case, "hot"). Put in graph-theoretic terms, "hot" is a neighbor of "hit". The idea is simply to start from the beginWord, then visit its neighbors, then the non-visited neighbors of its neighbors, until we arrive at the endWord. This is a typical BFS structure.
But now that we are also allowed to add or delete a letter, how should I proceed?
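One possible way to proceed, offered as a sketch rather than a definitive answer, is to keep the BFS exactly as described and only widen the definition of "neighbor" to include words reachable by one insertion or one deletion. The names neighbors and min_steps are made up for this example, and it assumes lowercase a-z words, that every intermediate word (of any length) must be in the word list, and that the result counts edits rather than words in the sequence.

```python
from collections import deque
from string import ascii_lowercase


def neighbors(word, words):
    """Yield dictionary words one substitution, insertion, or deletion away."""
    seen = set()
    for i in range(len(word) + 1):
        candidates = []
        if i < len(word):
            candidates.append(word[:i] + word[i + 1:])            # delete letter i
            candidates.extend(word[:i] + c + word[i + 1:]          # replace letter i
                              for c in ascii_lowercase)
        candidates.extend(word[:i] + c + word[i:]                  # insert before i
                          for c in ascii_lowercase)
        for cand in candidates:
            if cand != word and cand in words and cand not in seen:
                seen.add(cand)
                yield cand


def min_steps(begin, end, word_list):
    """Minimum number of single-letter edits from begin to end, or -1."""
    words = set(word_list)
    if begin == end:
        return 0
    if end not in words:
        return -1
    queue = deque([(begin, 0)])
    visited = {begin}
    while queue:
        word, steps = queue.popleft()
        for nxt in neighbors(word, words):
            if nxt == end:
                return steps + 1
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, steps + 1))
    return -1


print(min_steps("hit", "cog", ["hot", "dot", "dog", "lot", "log", "cog"]))  # 4
```

For the example from the problem statement the answer is unchanged, because every word in the list has three letters and so no insertion or deletion lands on a listed word.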

Given a string, form a meaningful sentence by adding spaces between the words

You are given a string, for example "Iamastudent", without any spaces. You are provided with a predefined dictionary function which verifies whether a given word is present in the dictionary or not. Using this function, you have to insert the spaces in the string and print it as "I am a student".
This was my interview question, and I was told to solve it in C++. I solved it using dynamic programming, but the interviewer was not satisfied.
The solution I gave is the same as in the question below:
Given a phrase without spaces add spaces to make proper sentence
He asked me to do it using a trie or a suffix array, but I couldn't figure out the solution. Can anyone help me?
Find words and put spaces after them
The answer is to use a trie data structure. Build a trie from the possible words and keep traversing it; with the trie you can generate the different candidate words.
Here, for "iamastudent", the trie lets you generate these words:
i, a, am, a, as, student
Now you have to make a proper sentence out of these words. One possible solution is a Markov chain: a data structure that holds, for each word, the probabilities of the words that can follow it. So the Markov chain will look like this:
"i" : [ "am", "did", "went" ...],
"a" : [ "tree", "dog" ..]
"am" : [ "a" ...]
Now you have this data in sequence:
[i], [a, am], [a, as], [student]
Note: I grouped all the elements which start with the same character into one list.
Start with "i".
The next word is "a", but in the Markov chain "a" is not listed after "i", so move on to the next candidate ("am"). You can continue like this.
From here onwards it is a DFS for a valid sentence. Well, it was a nice and tricky question.
If there is a unique way of splitting the sentence, then doing it with a trie is simple:
1. If there are characters left in the input string, start walking down from the root, consuming characters from the string; otherwise terminate.
2. If it is a compressed trie, you will find a mark whenever a prefix is a complete word; otherwise a complete word is signalled by reaching a leaf. Either way, that is when you output a space.
3. Go back to step 1 (walking down from the root), starting from the current position in the string.
You are done when there are no more characters in the string (you may want to check that at this point you are not in the middle of traversing the trie).
If the solution is not unique, then whenever you reach the end of the string and you are not at a mark or a leaf in the trie, you need to backtrack to the previous space you emitted. You need a stack of positions in the input string.
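The walk-and-backtrack procedure above can be sketched as follows. This is an illustrative Python version, not the C++ the interviewer asked for; the names build_trie and segment, the dict-of-dicts trie, and the use of recursion in place of an explicit stack of positions are all assumptions of the sketch. The set of failed start positions is an addition that keeps backtracking from re-exploring the same suffix.

```python
END = ""  # end-of-word mark; cannot collide with a one-character key


def build_trie(words):
    """Build a dict-of-dicts trie; a node containing END marks a complete word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def segment(text, trie):
    """Return one list of dictionary words that concatenates to text, or None."""
    dead = set()  # start positions already known to have no valid split

    def solve(start):
        if start == len(text):
            return []
        if start in dead:
            return None
        node = trie
        for end in range(start, len(text)):
            node = node.get(text[end])
            if node is None:          # fell off the trie: no longer word here
                break
            if END in node:           # a complete word ends at this position
                rest = solve(end + 1)
                if rest is not None:
                    return [text[start:end + 1]] + rest
        dead.add(start)               # every split from here failed
        return None

    return solve(0)


trie = build_trie(["i", "a", "am", "as", "student"])
print(" ".join(segment("iamastudent", trie)))  # i am a student
```

Note that "i a ..." is tried first and abandoned because no word starts with "m", which is the backtracking case described in the last paragraph.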

Word lexical families

I am given a set of N words and an integer k. Two words are in the same group if exactly their first k letters and exactly their last k letters are identical. If more than k or fewer than k letters are identical at either end, the words are not in the same group. For example:
For k=3.
"abcdefg" and "abczefg" are in the same group
"abcddefg" and "abcdzefg" are not in the same group (the first k+1 letters are identical)
"abc" and "abc" are in the same group
A word can be in more than one group. For example (k=3):
"abczefg" and "abcefg" form a group
"abczaefg" and "abcefg" form a group
"abczaefg" and "abczefg" are not in the same group (the first k+1 letters are identical)
The problem asks me to find the number of groups which contain the maximum number of words.
I thought about using a trie (prefix tree), and I assume this is the right data structure, but I don't know how to adapt it to this problem, because the rule that two words with more than k identical letters are not in the same group confuses me. My idea has complexity O(N*N*K), and considering that N<=10,000 and K<=100, I don't think it is fast enough. I would like to explain my idea, but it is not yet clear even to me and I don't know whether it is correct, so I will skip this part.
My question is whether there is a faster algorithm for this problem and, if there is, I kindly ask you to explain it a little. Thank you in advance, and I am sorry for any grammatical mistakes and for not explaining the problem more clearly!
First group all the words that share the first k letters and last k letters. Your largest group must sit inside one of these groups, since there's no way two words that differ at their starts and ends can be in the same solution.
So, within each of these groups (of words that share the same k letters at their start and end), you need to find a maximal set of words such that no two share the k+1'th letter, nor the k+1'th letter from the end.
Construct a graph where vertices are the pairs of letters that are (k+1) from each end (de-duping) from words in one of these groups, and edges occur between (a, b) and (c, d) if a=c or b=d.
You need to find the largest set of vertices in this graph with no edges between them. This reduced problem is an instance of the "maximum independent set" problem, which is NP-hard in general, so you'll need to solve it by using a search and hoping the set of words you're given isn't too nasty. Perhaps there's something about the graphs here to give a faster solution, but I don't see it.
The solution to the entire problem is the largest solution to one of the reduced problems described above.
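A rough sketch of the steps above, under stated assumptions. The function names are made up, words no longer than k are simply skipped, and the code only computes the size of the largest edge-free set per bucket, not the final count the problem asks for. One further assumption goes beyond the answer: a set of pairs with no edges is a set in which no (k+1)-th letter and no (k+1)-th-from-the-end letter repeats, so the sketch reads it as a matching between the two letter positions and uses a simple augmenting-path search instead of a general independent-set search.

```python
from collections import defaultdict


def bucket_pairs(words, k):
    """Group words by (first k letters, last k letters).

    Each bucket holds the de-duplicated pairs
    (letter k+1 from the start, letter k+1 from the end).
    """
    buckets = defaultdict(set)
    for w in words:
        if len(w) > k:  # assumption: words of length <= k are ignored
            buckets[(w[:k], w[-k:])].add((w[k], w[-k - 1]))
    return buckets


def largest_edge_free_set(pairs):
    """Size of the largest subset of pairs sharing no first or second letter."""
    by_first = defaultdict(list)
    for a, b in pairs:
        by_first[a].append(b)
    match = {}  # second letter -> first letter currently paired with it

    def try_assign(a, seen):
        # Augmenting-path step: give `a` a free second letter, or move
        # the current owner of one to a different second letter.
        for b in by_first[a]:
            if b not in seen:
                seen.add(b)
                if b not in match or try_assign(match[b], seen):
                    match[b] = a
                    return True
        return False

    return sum(try_assign(a, set()) for a in list(by_first))


buckets = bucket_pairs(["abcdefg", "abczefg", "abcdzefg"], 3)
print(largest_edge_free_set(buckets[("abc", "efg")]))  # 2
```

Here "abcdefg" and "abczefg" can share a group, while "abcdzefg" conflicts with both, matching the examples in the question.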
Hope this helps!

Separating out a list with regex?

I have a CSV file which has been generated by a system. The problem is with one of the fields which used to be a list of items. An example of the original list is below....
The serial number of the desk is 45TYTU
This is the second item in the list
The colour of the apple is green
The ID code is 489RUI
This is the fourth item in the list.
And unfortunately the system spits out the output below.....
The serial number of the desk is 45TYTUThis is the second item in the listThe colour of the apple is greenThe ID code is 489RUIThis is the fourth item in the list.
As you can see, it ignores the line breaks and just bunches everything up. I am unable to modify the system that generates this output so what I am trying to do is come up with some sort of regex find and replace expression that will separate them out.
My original thought was to try to detect when an upper case letter appears in the middle of a lower case word, but, as with one of the items in the example, a line ending in a serial number throws this off.
Does anyone have any suggestions? Is regex the way to go?
--- EDIT ---
I think I need to simplify things for myself and, for now, ignore the fact that lines ending in a serial number will break things. I just need to create an expression that will insert a line break whenever an upper case letter follows a lower case one.
--- EDIT 2 ---
Using the example given by fardjad, everything works for the sample data given. The string was...
(.(?=[A-Z][a-z]))
Now, as I test with more data, I can see an issue appearing: certain lines begin with numbers, so it is seeing these as serial numbers. You can see an example of this at http://regexr.com?2vfi5
There are only about 10 known numbers that it uses at the start of lines, such as 240v, 120v etc...
Is there a way to exclude these?
That won't be a robust solution but this is what you asked. It matches the character before an uppercase letter followed by a lowercase one. You can simply use regex replace and append a new line character:
(.(?=[A-Z][a-z]))
see this demo.
You could search for this
(?<=\p{Ll})(?=\p{Lu})
and replace with a linebreak. The regex matches the empty space between a lowercase letter \p{Ll} and an uppercase letter \p{Lu}.
This assumes you're using a Unicode-aware regex engine (.NET, PCRE, Perl for example). If not, you might also get away with
(?<=[a-z])(?=[A-Z])
but this of course only detects lower-/uppercase changes in ASCII words.
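For comparison, here is how the two patterns from the answers behave on the sample data from the question. This is a sketch using Python's re module (the question does not say which engine is in use), and the two function names are made up for the example.

```python
import re

raw = ("The serial number of the desk is 45TYTU"
       "This is the second item in the list"
       "The colour of the apple is green"
       "The ID code is 489RUI"
       "This is the fourth item in the list.")


def split_on_case_change(text):
    """Insert a line break at every lowercase-to-uppercase boundary."""
    return re.sub(r"(?<=[a-z])(?=[A-Z])", "\n", text)


def split_before_capitalised_word(text):
    """Insert a line break after any character followed by Upper+lower."""
    return re.sub(r"(.(?=[A-Z][a-z]))", r"\1\n", text)


print(split_on_case_change(raw).count("\n") + 1)           # 3 lines
print(split_before_capitalised_word(raw).count("\n") + 1)  # 5 lines
```

The lookaround version leaves the two lines ending in a serial number fused to the next one, as the question anticipated. The second pattern separates all five items here, but it would also break before any capitalised word in the middle of a sentence.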