How to get all possible permutations of a string and all their substrings?

I am trying to find all possible permutations of a character string and all their substrings.
For example given the input 'abc' the function should return:
['a', 'b', 'c', 'ab', 'ac', 'ba', 'bc', 'ca', 'cb', 'abc', 'acb', 'bac', 'bca', 'cab', 'cba']
I've been trying for hours on end and couldn't find any solution. I didn't find any related question either. A C# or Java solution would be preferred, but it doesn't matter much. Pseudocode would be fine too.

The solution is actually annoyingly simple:
static void permSub(String s, String pre)
{
    System.out.println(pre);
    if (s.isEmpty()) return;
    for (int i = 0; i < s.length(); i++)
        permSub(s.substring(0, i) + s.substring(i + 1), pre + s.charAt(i));
}
Test:
public static void main(String[] args)
{
    permSub("abc", "");
}
Output:
a
ab
abc
ac
acb
b
ba
bac
bc
bca
c
ca
cab
cb
cba
Note that the method also prints the empty string (the initial value of pre) as its first line, which is technically correct, but you may want to filter it out.
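For readers who prefer to collect the results instead of printing them, here is a minimal Python sketch of the same recursion. The name perm_sub and the generator style are choices made for this illustration, not part of the answer above. It skips the empty prefix, and it assumes the input characters are distinct (repeated characters would produce repeated strings).

```python
def perm_sub(s, prefix=""):
    """Yield every non-empty arrangement built from the characters of s.

    Mirrors the recursion of permSub: emit the current prefix, then extend
    it with each remaining character in turn.
    """
    if prefix:  # skip the initial empty prefix
        yield prefix
    for i, ch in enumerate(s):
        # Remove the character at position i and append it to the prefix.
        yield from perm_sub(s[:i] + s[i + 1:], prefix + ch)


result = list(perm_sub("abc"))
print(result)
```

Sorting the result with key=lambda t: (len(t), t) gives the ordering used in the question.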

Related

Perl regex hash to match string

Right now I have the following code...
%strings = ( 'a' => 'x',
'b0' => 'y',
'b1' => 'y',
'b2' => 'y',
...
'bN' => 'y',
'c' => 'z');
....
if(grep { $_ eq $line[0] } keys %strings){
....
}
Overall, I set up this hash. $line is created by reading a file. I then check whether the first string in the line is contained in my hash. This code works perfectly. However, my problem is that the set of b keys in the hash keeps growing. For instance, right now I have to explicitly list b0 through b63. That is 64 separate definitions that all need the same value. Is there a way to have a regex for the hash key like b\/d\?
If you want to use a regular expression, nothing prevents you from doing so:
%strings = (
'a' => 'x',
'b\d+' => 'y',
'c' => 'z'
);
...
if( grep { $line[0] =~ /^$_$/ } keys %strings ) {
...
}
The ^ and $ are necessary to make sure the full string $line[0] matches and not only a part of it.
Bear in mind that this will be much slower than the eq comparison. On the other hand, the number of expressions to evaluate by grep will be much lower, so you may want to profile different options if the speed of execution is an issue.
Also, keep in mind that you may want to refine the regular expression. For instance, ^b\d{1,2}$ will match a b followed by one or two digits. Or even ^b[1-6]?\d$...
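To see what such a refinement actually accepts, it helps to enumerate the keys. The sketch below does this in Python rather than Perl, purely as an illustration; the patterns use only syntax the two regex flavors share. LOOSE is the last pattern quoted above, while STRICT and the helper accepted are hypothetical additions that restrict the range to b0 through b63.

```python
import re

# Pattern quoted in the answer: optional tens digit 1-6, then any digit.
LOOSE = re.compile(r"b[1-6]?\d")
# Hypothetical tighter variant: b0-b59 via [1-5]?\d, b60-b63 via 6[0-3].
STRICT = re.compile(r"b(?:[1-5]?\d|6[0-3])")


def accepted(pattern, upper=100):
    """Return the integers n in [0, upper) for which 'b<n>' fully matches."""
    return [n for n in range(upper) if pattern.fullmatch(f"b{n}")]


loose_hits = accepted(LOOSE)
strict_hits = accepted(STRICT)
print(max(loose_hits), max(strict_hits))
```

fullmatch plays the role of the ^ and $ anchors in the Perl version.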
If I understood you correctly,
b\d+
This will match "b" followed by any string of only numbers.
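my %strings = (
    'a' => 'x',
    (map { ("b$_" => 'y') } 0..63),
    'c' => 'z',
);
should do the trick ;) if that is what you want. The parentheses around map are needed: without them, map would also consume 'c' => 'z' as part of its input list.
If you need to add a 'b value' later in the code, you can still do $strings{"b$value"} = 'y'; to add the new key to the hash.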

Python Iterating through a string to look for a Palindrome

So I have looked around this site and others for information on how to iterate through a string in Python, find a specific substring, reverse it, and check whether the two are equal in order to detect a palindrome. The problem is that some of the test cases are challenging, and I am confused about how to find them through indexing.
This is my code that works for all, but two test cases:
def countPalindromes(s):
    count = 0
    firstindex = 0
    lastindex = len(s)-1
    while firstindex != lastindex and firstindex <= lastindex:
        ch1 = s[firstindex:lastindex]
        ch2 = s[lastindex:firstindex:-1]
        if ch1 == ch2:
            count +=1
        firstindex +=1
        lastindex -=1
    return count
This code works for the following Palindromes: "racecar", " ", and "abqc".
It does not work for these Palindromes "aaaa" and "abacccaba".
For "aaaa" there are 6 palindromes and for "abacccaba" there are 8 palindromes. This is where my problem occurs, and I simply can't figure it out. For the 6 palindromes for "aaaa" I get aaaa, aaa, aa, twice for each. For "abacccaba" the 8 palindromes I have no idea as I get abacccaba, bacccab, accca, ccc, aba, aba.
I understand this is a confusing question, but I am lost how to approach the problem since I only get 2 for the "aaaa" and 4 for "abacccaba". Any ideas how I would cut out the substrings and get these values?
Thanks in advance!
while firstindex != lastindex and firstindex <= lastindex: misses the case of a single character palindrome.
You're also missing the case where aa contains three palindromes, 0:1, 0:2 and 1:2.
I think you're missing some palindromes for aaaa; there are 10:
aaaa
a
a
a
a
aa
aa
aa
aaa
aaa
If single-character palindromes do not count, then we have 6.
Either way, you need to consider all substrings as possible palindromes; not only the ones in the middle. Comparing a string against its reversed self is very easy to do in Python: s == s[::-1].
Getting all the substrings is easy too:
def get_all_substrings(input_string):
    length = len(input_string)
    return [input_string[i:j+1] for i in range(length) for j in range(i,length)]
and filtering out strings of length < 2 is also easy:
substrings = [a for a in get_all_substrings(string) if len(a) > 1]
Combining these should be fairly straightforward:
len([a for a in get_all_substrings(string) if len(a) > 1 and a == a[::-1]])
I think you should write a function(f) individually to check if a string is a palindrome.
Then make a function(g) that selects sub-strings of letters.
Eg: in string abcd, g will select a, b, c, d, ab, bc, cd, abc, bcd, abcd. Then apply f on each of these strings individually to get the number of palindromes.
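The two-function split described in the answers can be sketched as follows. The names is_palindrome, substrings and count_palindromes, as well as the min_length parameter, are assumptions made for this illustration. This is a brute-force sketch that checks every substring, which should be fine for short inputs like the test cases in the question.

```python
def is_palindrome(text):
    """Return True if text reads the same forwards and backwards."""
    return text == text[::-1]


def substrings(text, min_length=1):
    """Return every contiguous substring of text with at least min_length characters."""
    n = len(text)
    return [text[i:j] for i in range(n) for j in range(i + min_length, n + 1)]


def count_palindromes(text, min_length=2):
    """Count palindromic substrings; min_length=2 ignores single characters."""
    return sum(1 for part in substrings(text, min_length) if is_palindrome(part))


print(count_palindromes("aaaa"))       # 6
print(count_palindromes("abacccaba"))  # 8
```

Passing min_length=1 also counts single characters, which gives 10 for "aaaa", as listed above.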

Convert a set of numbers into a word

I need to convert a given string of numbers to the word those numbers correspond to. For example:
>>>number_to_word ('222 2 333 33')
'CAFE'
The numbers work like they do on a cell phone: press the second button once and you get an 'A', press it twice and you get a 'B', etc. Let's say I want the letter 'E': I'd have to press the third button twice.
I would like to have some help trying to understand the easiest way to do this function. I have thought on creating a dictionary with the key being the letter and the value being the number, like this:
dic={'A':'2', 'B':'22', 'C':'222', 'D':'3', 'E':'33',etc...}
And then using a 'for' loop to read all the numbers in the string, but I do not know how to start.
You need to reverse your dictionary:
def number_to_word(number):
    dic = {'2': 'A', '22': 'B', '222': 'C', '3': 'D', '33': 'E', '333': 'F'}
    return ''.join(dic[n] for n in number.split())
>>> number_to_word('222 2 333 33')
'CAFE'
Let's start inside out. number.split() splits the text with your number at white space characters:
>>> number = '222 2 333 33'
>>> number.split()
['222', '2', '333', '33']
We use a generator expression ((dic[n] for n in number.split())) to find the letter for each number. Here is a list comprehension that does nearly the same but also shows the result as a list:
>>> [dic[n] for n in number.split()]
['C', 'A', 'F', 'E']
This lets n run through all elements in the list with the numbers and uses n as the key in the dictionary dic to get the corresponding letter.
Finally, we use the method join() with an empty string as separator to turn the list into a string:
>>> ''.join([dic[n] for n in number.split()])
'CAFE'
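The dictionary above only covers the letters A to F. Rather than typing out the remaining entries by hand, the table could be generated from the key layout. The sketch below assumes the standard phone keypad (2=ABC through 9=WXYZ); KEYS and build_table are names invented for this example.

```python
# Assumed standard phone keypad layout.
KEYS = {"2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
        "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ"}


def build_table(keys=KEYS):
    """Map each press sequence ('2', '22', '222', ...) to its letter."""
    return {digit * presses: letter
            for digit, letters in keys.items()
            for presses, letter in enumerate(letters, start=1)}


def number_to_word(number):
    """Translate whitespace-separated press sequences into a word."""
    table = build_table()
    return "".join(table[group] for group in number.split())


print(number_to_word("222 2 333 33"))  # CAFE
```

An unknown group such as '2222' raises KeyError here; whether to raise or skip it is a design choice the question leaves open.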

how to find if a word contains a permutation of a pattern characters?

I have a pattern of length <= 100 and a set of fewer than 20 words. I want to find the number of words that contain a permutation of the pattern's characters. For example, if the pattern is "cat" and the set of words is "ttact tract tattc", the output should be 2.
ttact: matches because it contains tac
tract: matches because it contains act
tattc: does not match
Here is the code:
public static void main(String[] args) {
    String pattern = "cat";
    char[] p = pattern.toCharArray();
    Arrays.sort(p);
    String sen = "ttact tract tattc";
    for (char c : p)
        System.out.println(c);
    String[] words = sen.split(" ");
    if (pattern.length() == 1)
    {
        String[] len = sen.split(pattern);
    }
    else
    {
        int count = 0;
        for (String word : words)
        {
            String found = "";
            for (int i = 0; i < word.length(); i++)
            {
                if (pattern.indexOf(word.charAt(i)) != -1)
                {
                    found += word.charAt(i);
                    if (found.length() == pattern.length())
                    {
                        char f[] = found.toCharArray();
                        Arrays.sort(f);
                        if (Arrays.equals(f, p))
                        {
                            count++;
                            found = "";
                        }
                        else
                            found = "";
                    }
                }
                else
                {
                    found = "";
                }
            }
        }
        System.out.println(count);
    }
}
}
Any permutation of the characters in the pattern must have exactly the same length as the pattern. You could investigate all the substrings of a word with the same length as the pattern and check for each substring if it is a permutation of the pattern (for example by sorting the letters). Repeat for each word and count the matches.
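A minimal Python sketch of the approach described above, comparing the sorted letters of each window with the sorted pattern. The function names contains_permutation and count_matching_words are hypothetical. Each word is counted at most once, however many windows match.

```python
def contains_permutation(word, pattern):
    """Return True if some substring of word is a rearrangement of pattern."""
    k = len(pattern)
    target = sorted(pattern)
    # Slide a window of length k over the word and compare sorted letters.
    return any(sorted(word[i:i + k]) == target
               for i in range(len(word) - k + 1))


def count_matching_words(sentence, pattern):
    """Count the words in sentence that contain a permutation of pattern."""
    return sum(contains_permutation(word, pattern) for word in sentence.split())


print(count_matching_words("ttact tract tattc", "cat"))  # 2
```

Sorting every window keeps the code short. For long patterns, maintaining letter counts while the window slides would avoid the repeated sorting.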
You can split the solution into two steps:
1- Find all the permutations of the pattern (cat => cat, cta, act, atc, tca, tac). You can refer to "Finding all permutation of words in a sentence" for this.
2- Count the words that contain at least one of those permutations.
You can use LINQ, for example:
var permutations = PermuteWords(input); // take this function from the question referenced above
var words = sen.Split(' '); // split the sentence into an array of words
// count each word once if it contains any permutation; to keep the matching words, replace Count() with ToList()
var count = (from w in words
             where permutations.Any(p => w.Contains(p))
             select w).Count();
Hope this helps. If you still have any questions, don't hesitate to ask, and if it helped you, please mark it as the answer.

Matching token sequences

I have a set of n tokens (e.g., a, b, c) distributed among a bunch of other tokens. I would like to know if all members of my set occur within a given number of positions (window size). It occurred to me that it may be possible to write a RegEx to capture this state, but the exact syntax eludes me.
          11111
012345678901234
ab ab bc  a cba
In this example, given window size=5, I would like to match cba at positions 12-14, and abc in positions 3-7.
Is there a way to do this with RegEx, or is there some other kind of grammar that I can use to capture this logic?
I am hoping to implement this in Java.
Here's a regex that matches 5-letter sequences that include all of 'a', 'b' and 'c':
(?=.{0,4}a)(?=.{0,4}b)(?=.{0,4}c).{5}
So, while basically matching any 5 characters (with .{5}), there are three preconditions the matches have to observe. Each of them requires one of the tokens/letters to be present (up to 4 characters followed by 'a', etc.). (?=X) matches "X, with a zero-width positive look-ahead", where zero-width means that the character position is not moved while matching.
Doing this with regexes is slow, though. Here's a more direct version (it seems to be about 15x faster than using regular expressions):
public static void find(String haystack, String tokens, int windowLen) {
    char[] tokenChars = tokens.toCharArray();
    int hayLen = haystack.length();
    int pos = 0;
    nextPos:
    while (pos + windowLen <= hayLen) {
        for (char c : tokenChars) {
            int i = haystack.indexOf(c, pos);
            if (i < 0) return;
            if (i - pos >= windowLen) {
                pos = i - windowLen + 1;
                continue nextPos;
            }
        }
        // match found at pos
        System.out.println(pos + ".." + (pos + windowLen - 1) + ": " + haystack.substring(pos, pos + windowLen));
        pos++;
    }
}
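For comparison, the same window test can be written as a brute-force check in Python. This is only a sketch for verifying results on small inputs: it inspects every window and does not implement the position-skipping optimization of the Java method above. The name find_windows is hypothetical, and the haystack below uses single spaces throughout.

```python
def find_windows(haystack, tokens, window_len):
    """Return every start index whose window contains all characters in tokens."""
    needed = set(tokens)
    return [start
            for start in range(len(haystack) - window_len + 1)
            # Subset test: every needed token occurs somewhere in the window.
            if needed <= set(haystack[start:start + window_len])]


text = "ab ab bc a cba"
for start in find_windows(text, "abc", 5):
    print(start, repr(text[start:start + 5]))
```

Unlike a regex scan that consumes each match, this reports overlapping windows too, so a run of adjacent start positions is expected.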
This tested Java program has a commented regex which does the trick:
import java.util.regex.*;
public class TEST {
    public static void main(String[] args) {
        String s = "ab ab bc  a cba";
        Pattern p = Pattern.compile(
            "# Match 5 char sequences containing: a and b and c\n" +
            "(?=[abc]) # Assert first char is a, b or c.\n" +
            "(?=.{0,4}a) # Assert an 'a' within 5 chars.\n" +
            "(?=.{0,4}b) # Assert a 'b' within 5 chars.\n" +
            "(?=.{0,4}c) # Assert a 'c' within 5 chars.\n" +
            ".{5} # If so, match the 5 chars.",
            Pattern.COMMENTS);
        Matcher m = p.matcher(s);
        while (m.find()) {
            System.out.print("Match = \""+ m.group() +"\"\n");
        }
    }
}
Note that there is another valid sequence, S9:13 " a cb", in your test data (before the S12:14 "cba"). Assuming you did not want to match this one, I added an additional constraint to filter it out, which requires that the 5-char window begin with an a, b or c.
Here is the output from the script:
Match = "ab bc"
Match = "a cba"
Well, one possibility (albeit a completely impractical one) is simply to match against all permutations:
abc..|ab.c.|ab..c| .... etc.
This can be factorised somewhat:
ab(c..|.c.|..c)|a.(bc.|b.c .... etc.
I'm not sure if you can do better with regex.
Pattern p = Pattern.compile("(?:a()|b()|c()|.){5}\\1\\2\\3");
String s = "ab ab bc a cba";
Matcher m = p.matcher(s);
while (m.find())
{
System.out.println(m.group());
}
output:
ab bc
a cb
This is inspired by Recipe #5.7 in Regular Expressions Cookbook. Each back-reference (\1, \2, \3) acts like a zero-width assertion, indicating that the corresponding capturing group participated in the match, even though the group itself didn't consume any characters.
The authors warn that this trick relies on behavior that's undocumented in most flavors. It works in Java, .NET, Perl, PHP, Python and Ruby (original and Oniguruma), but not in JavaScript or ActionScript.