I need to create a function that replaces each letter with the letter 13 places after it in the alphabet (without using encode). I'm relatively new to Python, so it has taken me a while to figure out a way to do this without using encode.
Here's what I have so far. When I pass in a normal word like "hello" it works, but if I pass in a sentence with special characters, I can't figure out how to include JUST the letters of the alphabet and skip numbers, spaces, or special characters completely.
def rot13(b):
    b = b.lower()
    a = [chr(i) for i in range(ord('a'),ord('z')+1)]
    c = []
    d = []
    x = a[0:13]
    for i in b:
        c.append(a.index(i))
    for i in c:
        if i <= 13:
            d.append(a[i::13][1])
        elif i > 13:
            y = len(a[i:])
            z = len(x)- y
            d.append(a[z::13][0])
    e = ''.join(d)
    return e
EDIT
I tried using .isalpha(), but it doesn't seem to work for me: characters are duplicated for some reason when I use it. Is the following format correct?
def rot13(b):
    b1 = b.lower()
    a = [chr(i) for i in range(ord('a'),ord('z')+1)]
    c = []
    d = []
    x = a[0:13]
    for i in b1:
        if i.isalpha():
            c.append(a.index(i))
            for i in c:
                if i <= 12:
                    d.append(a[i::13][1])
                elif i > 12:
                    y = len(a[i:])
                    z = len(x)- y
                    d.append(a[z::13][0])
        else:
            d.append(i)
    if message[0].istitle() == True:
        d[0] = d[0].upper()
    e = ''.join(d)
    return e
Following on from the comments: the OP was advised to use isalpha and is wondering why that causes duplication (see the OP's edit).
This isn't tied to the use of isalpha; it's caused by the second for loop:
for i in c:
That loop isn't necessary, and it is what causes the duplication: it runs inside the outer loop, so every index collected so far is processed again for each new letter. You should remove it. Instead, you can get the same result with index = a.index(i). You were already doing this lookup, but appending the result to a list and looping over that list is what caused the confusion.
Use the index variable wherever you would have used i inside the for i in c loop. On a side note, try not to reuse the same variable name in nested for loops. It just causes confusion, but that's a matter for code review.
Assuming you make those changes, it should work.
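For reference, here is a minimal sketch of what the function might look like once the inner loop is gone. It is not the OP's exact code: it checks membership in string.ascii_lowercase instead of calling isalpha (so accented letters are passed through rather than raising an error), and it keeps the case of every letter rather than only the first one. Both choices are assumptions made for the illustration.

```python
import string


def rot13(text):
    """Rotate ASCII letters by 13 places; leave every other character as-is."""
    lower = string.ascii_lowercase
    result = []
    for ch in text:
        if ch.lower() in lower:
            # One lookup per character, no second loop over collected indexes.
            index = lower.index(ch.lower())
            shifted = lower[(index + 13) % 26]
            result.append(shifted.upper() if ch.isupper() else shifted)
        else:
            # Digits, spaces and punctuation are copied unchanged.
            result.append(ch)
    return "".join(result)


print(rot13("Hello, World! 123"))  # Uryyb, Jbeyq! 123
```

Applying the function twice returns the original text, which is a quick way to check it.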
Related
I want to take user input for a message. Then I generate a random key using Python's random package.
But how do I shift each letter in the message using the key's ASCII values so that the output is a string of letters only?
Example :
message = hi
random key generated = bi
encrypted message = "something in alphabets only like xh or mo."
Use a dictionary to map each character to another so that you create a new alphabet, then loop through the dictionary and replace the characters using it:
stringa = input()
swapa = {"A":"Q", "B":"A","C":"L"...}
for i in swapa:
    stringa = stringa.replace(i,swapa[i])
print(stringa)
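One thing to watch with chained replace calls: a character that has already been substituted can be substituted again by a later pair. The sketch below uses a hypothetical two-letter mapping (not the one from the answer) purely to make the difference visible, and compares the loop with str.translate, which maps every character exactly once.

```python
# Hypothetical mapping, chosen only to expose the difference.
swap = {"A": "B", "B": "A"}


def substitute_with_replace(text, mapping):
    # Same approach as the loop above: one replace call per dictionary entry.
    for src, dst in mapping.items():
        text = text.replace(src, dst)
    return text


def substitute_with_translate(text, mapping):
    # str.maketrans accepts a dict of single characters; translate maps each
    # input character once, so earlier substitutions are never revisited.
    return text.translate(str.maketrans(mapping))


print(substitute_with_replace("AB", swap))    # AA
print(substitute_with_translate("AB", swap))  # BA
```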
You could also take it a step further and encrypt and decrypt using a keyword:
ite = int(input())
for itar in range(ite):
    keyw = list(input())
    # remove duplicates and keep order
    rem = set()
    for i in keyw:
        if keyw.count(i) > 1:
            rem.add(i)
    for i in rem:
        keyw= keyw[::-1]
        keyw.remove(i)
        keyw= keyw[::-1]
    keyw = "".join(keyw)
    # set up the alphabet
    linaalfa = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    linaalfa += " "*(len(keyw)-len(linaalfa)%len(keyw)) # adds spaces on the last line
    linaalf = "".join([i for i in linaalfa if i not in keyw]) # removes duplicates
    linaalf1 = [keyw]+[linaalf[x:x+len(keyw)] for x in range(0,len(linaalf),len(keyw))]
    # find the alphabetical order of the keyword letters
    order = dict()
    for w,i in enumerate(sorted(keyw)):
        order[keyw.index(i)] = w
    # put the columns in alphabetical order
    orderalfa = ["".join([i[q] for i in linaalf1]) for q in range(len(keyw))] # now read by column
    temp = [""]*len(orderalfa)
    for w,i in enumerate(orderalfa):
        temp[order[w]] = i
    orderalfa = temp
    orderalfa = "".join([x.strip() for x in orderalfa])
    # relate to the base alphabet
    translate = dict()
    encrypt = dict()
    for w,i in enumerate(orderalfa):
        translate[i] = linaalfa[w]
        encrypt[linaalfa[w]] = i
    # translate the message
    lina = input().split()
    outa = list()
    for i in lina:
        temp = ""
        for q in i:
            temp += translate[q]
        outa.append(temp)
    print(" ".join(outa))
Here it takes the input:
n # number of queries
keyword
string that needs to be operated on
The last bit, however, is set to decrypt the message. If you want it to encrypt instead, you need to replace the line
temp += translate[q]
to
temp += encrypt[q]
This takes a keyword and:
removes duplicate letters from the keyword
places the keyword in front of a normal alphabet
splits the alphabet into pieces of the same length as the keyword
places them above each other
orders the columns as if the keyword's letters were written in alphabetical order
takes each column and creates a new alphabet by writing the columns one after another
this new alphabet now works as the substitution alphabet
For example, if the new alphabet began with "HGJ", then A would be H, B would be G, and C would be J.
This is still only a monoalphabetic encryption, though.
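The steps listed above can be sketched more compactly as follows. This is an illustration rather than the answer's code: the names keyword_alphabet, encrypt and decrypt are made up for the example, and it assumes the keyword contains only the letters A to Z.

```python
import string

BASE = string.ascii_uppercase


def keyword_alphabet(keyword):
    """Build the substitution alphabet from a keyword (letters A-Z assumed)."""
    # Drop repeated letters, keeping the first occurrence of each.
    key = "".join(dict.fromkeys(keyword.upper()))
    # Keyword is the first row; the remaining letters fill the rows below it.
    rest = "".join(ch for ch in BASE if ch not in key)
    rows = [key] + [rest[i:i + len(key)] for i in range(0, len(rest), len(key))]
    # Read the columns in the alphabetical order of the keyword's letters.
    columns = []
    for ch in sorted(key):
        col = key.index(ch)
        columns.append("".join(row[col] for row in rows if col < len(row)))
    return "".join(columns)


def encrypt(text, keyword):
    # Plain letter number k becomes letter number k of the new alphabet.
    return text.upper().translate(str.maketrans(BASE, keyword_alphabet(keyword)))


def decrypt(text, keyword):
    return text.upper().translate(str.maketrans(keyword_alphabet(keyword), BASE))


print(keyword_alphabet("SECRET"))
print(decrypt(encrypt("HELLO WORLD", "SECRET"), "SECRET"))  # HELLO WORLD
```

Characters outside A to Z, such as spaces, are left untouched by translate, so the message does not need to be split into words first.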
I am trying to create a sequence of similar dictionaries and then store them in a tuple. I tried two approaches, with and without a for loop.
Without for loop
import numpy as np
dic0 = {'modo': lambda x: x[0]}
dic1 = {'modo': lambda x: x[1]}
lst = []
lst.append(dic0)
lst.append(dic1)
tup = tuple(lst)
dic0 = tup[0]
dic1 = tup[1]
f0 = dic0['modo']
f1 = dic1['modo']
x = np.array([0,1])
print (f0(x) , f1(x)) # 0 , 1
With a for loop
lst = []
for j in range(0,2):
    dic = {}
    dic = {'modo': lambda x: x[j]}
    lst.insert(j,dic)
tup = tuple(lst)
dic0 = tup[0]
dic1 = tup[1]
f0 = dic0['modo']
f1 = dic1['modo']
x = np.array([0,1])
print (f0(x) , f1(x)) # 1 , 1
I really don't understand why I am getting different results. It seems that the last dictionary I insert overwrites the previous ones, but I don't know why (the append method does not work either).
Any help would be really welcome.
This is happening due to how scoping works in this case. Try putting j = 0 above the final print statement and you'll see what happens.
Also, you might try
from operator import itemgetter
lst = [{'modo': itemgetter(j)} for j in range(2)]
You have accidentally created what is known as a closure. The lambda functions in your second (loop-based) example include a reference to a variable j. That variable is actually the loop variable used to iterate your loop. So the lambda expression actually produces code with a reference to "some variable named 'j' that I didn't define, but it's around here somewhere."
This is called "closing over" or "enclosing" the variable j, because even when the loop is finished, there will be this lambda function you wrote that references the variable j. And so it will never get garbage-collected until you release the references to the lambda function(s).
You get the same value (1, 1) printed because j stops iterating over the range(0,2) with j=1, and nothing changes that. So when your lambda functions ask for x[j], they're asking for the present value of j, then getting the present value of x[j]. In both functions, the present value of j is 1.
You could work around this by creating a make_lambda function that takes an index number as a parameter. Or you could do what @DavisYoshida suggested, and use someone else's code to create the appropriate closure for you.
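A minimal sketch of the workarounds, using a plain list instead of a NumPy array so it runs without extra packages. The names build_late and make_getter are made up for the illustration; the default-argument form (j=j) is a common alternative that was not mentioned above.

```python
data = [10, 20]


def build_late():
    # Reproduces the problem: both lambdas look up j when they are called,
    # and by then the loop has finished with j == 1.
    lst = []
    for j in range(2):
        lst.append({'modo': lambda x: x[j]})
    return tuple(lst)


def make_getter(index):
    # Each call creates a new scope, so every lambda keeps its own index.
    return lambda x: x[index]


late = build_late()
factory = tuple({'modo': make_getter(j)} for j in range(2))
# A default argument is evaluated when the lambda is created, not when called.
default = tuple({'modo': lambda x, j=j: x[j]} for j in range(2))

print([d['modo'](data) for d in late])     # [20, 20]
print([d['modo'](data) for d in factory])  # [10, 20]
print([d['modo'](data) for d in default])  # [10, 20]
```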
I'm taking an online beginner course through Google on Python 2, and I cannot figure out the answer to one of the questions. Here it is, and thanks in advance for your help!
# A. match_ends
# Given a list of strings, return the count of the number of
# strings where the string length is 2 or more and the first
# and last chars of the string are the same.
# Note: python does not have a ++ operator, but += works.
def match_ends(words):
a = []
for b in words:
return
I tried a few different things. This is just where I left off on my last attempt before deciding to ask for help. I have spent more time thinking about this than I care to mention.
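def match_ends(words):
    a = []
    for b in words:
        # "2 or more" characters, with matching first and last characters
        if len(b) >= 2 and b[0] == b[-1]:
            a.append(b)
    # the exercise asks for the count, not the matching strings
    return len(a)

def match_ends2(words):
    # same check as a single expression
    return len([x for x in words if len(x) >= 2 and x[0] == x[-1]])

print(match_ends(['peter','paul','mary','tibet']))
print(match_ends2(['peter','paul','mary','tibet']))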
My question is the following: I have a file that contains around 70 strings, each 6 characters long (a, c, g, or t at every position; these are short DNA sequences).
For example:
accggt agctta gggatc gactta ccttgg
What I need are the strings which are completely unique, i.e. which have a different character (base) at every position compared with the other strings.
In this case I would get two matches (I define them as lists but this is only an idea for the output format):
[accggt , gggatc]
[gggatc , ccttgg]
The elements of list 1 differ at every position, and so do the elements of list 2.
Is there a built-in function that can do this? I also thought of regular expressions, but I'm not that familiar with that approach.
Thanks in advance!
Edit:
OK, it seems this is not that easy to describe, so let's go into more detail:
Let's take the five strings mentioned above:
I would start to compare the first string with all the other strings and then continue with string 2 comparing with all other strings and so on.
The first character of the first string is an a.
The first character of the second string is also an a.
This means I would discard the second string.
The first character of the third string is a g.
Fine.
The second character of the first string is a c.
The second character of the third string is a g.
Fine.
The third character of the first string is a c.
The third character of the third string is a g.
Fine.
The fourth ... and so on.
And if all characters of one string differ from the characters of another string (at every position, as described above), I would keep those two strings and then search for the next strings that differ at every position from the strings I have already found. Because I only have four letters, there should be only four possibilities for different strings.
I should end up with something, probably a list, that contains the groups of strings which differ at every position.
I hope this helps.
You can use the following algorithm: iterate through all possible pairs of words in your string and check that each pair has no matching character at any position with if [x == y for (x, y) in zip(word, nextWord)].count(True) == 0:.
Here is a snippet:
s = "accggt agctta gggatc gactta ccttgg"
chks = s.split(" ")
for word in chks:
    for nextWord in chks:
        if word != nextWord:
            if [x == y for (x, y) in zip(word, nextWord)].count(True) == 0:
                print([word, nextWord])
Result of the IDEONE demo:
['accggt', 'gggatc']
['gggatc', 'accggt']
['gggatc', 'ccttgg']
['ccttgg', 'gggatc']
UPDATE
You can deduplicate the list with a custom function. Here is an updated snippet:
def dedup(lst):
    seen = set()
    result = []
    for item in lst:
        fs = frozenset(item)
        if fs not in seen:
            result.append(item)
            seen.add(fs)
    return result

res = []
s = "accggt agctta gggatc gactta ccttgg"
chks = s.split(" ")
for word in chks:
    for nextWord in chks:
        if word != nextWord:
            if [x == y for (x, y) in zip(word, nextWord)].count(True) == 0:
                res.append([word, nextWord])
print(dedup(res))
Result: [['accggt', 'gggatc'], ['gggatc', 'ccttgg']].
To check the words in groups of three, generate every 3-word permutation of the list and use something like:
from itertools import permutations

def dedup(lst):
    seen = set()
    result = []
    for item in lst:
        fs = frozenset(item)
        if fs not in seen:
            result.append(item)
            seen.add(fs)
    return result

res = []
s = "accggt agctta gggatc gactta ccttgg"
chks = s.split(" ")
perms = [p for p in permutations(chks, 3)]
for perm in perms:
    if [(x == y or y == z or x == z) for (x, y, z) in zip(*perm)].count(True) == 0:
        res.append(perm)
print(dedup(res))
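As a variation on the snippets above, itertools.combinations yields each unordered group only once, so no deduplication step is needed. This is a sketch under that substitution, not the answer's own method; the helper names differ_everywhere and distinct_groups are made up for the example.

```python
from itertools import combinations


def differ_everywhere(*words):
    """True if, at every position, all the given words have distinct characters."""
    return all(len(set(column)) == len(column) for column in zip(*words))


def distinct_groups(words, size=2):
    # combinations() never yields the same group in a different order.
    return [group for group in combinations(words, size) if differ_everywhere(*group)]


sequences = "accggt agctta gggatc gactta ccttgg".split()
print(distinct_groups(sequences))     # [('accggt', 'gggatc'), ('gggatc', 'ccttgg')]
print(distinct_groups(sequences, 3))  # []
```

The same function handles groups of three or four by changing the size argument.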
To find the DNA strings which are completely different at every character, you have to check every string against every other string and see whether any character of the given string matches the character at the same position in the string it is compared with.
Here is some example code for that:
# read all dna strings into a list of strings
dna = ['accggt', 'agctta', 'gggatc', 'gactta', 'ccttgg', '123456']

def compare_two_dna(dna1, dna2):
    # True if the two strings share a character at any position
    i = 0
    l = len(dna1)
    while(i < l):
        if dna1[i] == dna2[i]:
            return True
        i += 1
    return False

def is_dna_unique(d, dna_strings):
    # d always matches itself, so exactly one match means no other string overlaps
    return len(list(filter(lambda x: compare_two_dna(d, x), dna_strings))) == 1

# keep only the strings that overlap with no other string in the list
unique_dna = list(filter(lambda d: is_dna_unique(d, dna), dna))
print(unique_dna)
The result here is: ['123456']
var dnaList = "accggt agctta gggatc gactta ccttgg".split( " " );
function getUniqueDnas( dna_list ){
    var result = [];
    for( var d1 in dna_list ){
        var isRepeat = false;
        var dna1 = dna_list[ d1 ];
        for( var d2 in dna_list ){
            var dna2 = dna_list[ d2 ];
            // skip comparing an entry with itself, otherwise everything counts as a repeat
            if( d1 !== d2 && dna1 == dna2 ){
                isRepeat = true;
                break;
            }
        }
        if( !isRepeat )
            result.push( dna1 );
    }
    return result;
}
var uniqueDnaList = getUniqueDnas( dnaList );
This is not homework, but an old exam question. I am curious to see the answer.
We are given an alphabet S={0,1,2,3,4,5,6,7,8,9,+}. Define the language L as the set of strings w from this alphabet such that w is in L if:
a) w is a number such as 42 or w is the (finite) sum of numbers such as 34 + 16 or 34 + 2 + 10
and
b) The number represented by w is divisible by 3.
Write a regular expression (and a DFA) for L.
This should work:
^(?:0|(?:(?:[369]|[147](?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147]0*(?:\+?(?:0\
+)*[369]0*)*\+?(?:0\+)*[258])*(?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[258]|0*(?:
\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147]0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147])|[
258](?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[258]0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0
\+)*[147])*(?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147]|0*(?:\+?(?:0\+)*[369]0*)
*\+?(?:0\+)*[258]0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[258]))0*)+)(?:\+(?:0|(?:(?
:[369]|[147](?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147]0*(?:\+?(?:0\+)*[369]0*)
*\+?(?:0\+)*[258])*(?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[258]|0*(?:\+?(?:0\+)*
[369]0*)*\+?(?:0\+)*[147]0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147])|[258](?:0*(?
:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[258]0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147])*
(?:0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[147]|0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)
*[258]0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*[258]))0*)+))*$
It works by having three states representing the sum of the digits so far modulo 3. It disallows leading zeros on numbers, and plus signs at the start and end of the string, as well as two consecutive plus signs.
Generation of regular expression and test bed:
a = r'0*(?:\+?(?:0\+)*[369]0*)*\+?(?:0\+)*'
b = r'a[147]'
c = r'a[258]'
r1 = '[369]|[147](?:bc)*(?:c|bb)|[258](?:cb)*(?:b|cc)'
r2 = '(?:0|(?:(?:' + r1 + ')0*)+)'
r3 = '^' + r2 + r'(?:\+' + r2 + ')*$'
r = r3.replace('b', b).replace('c', c).replace('a', a)
print(r)
# Test on 10000 examples.
import random, re
random.seed(1)
r = re.compile(r)
for _ in range(10000):
    x = ''.join(random.choice('0123456789+') for j in range(random.randint(1,50)))
    if re.search(r'(?:\+|^)(?:\+|0[0-9])|\+$', x):
        valid = False
    else:
        valid = eval(x) % 3 == 0
    result = re.match(r, x) is not None
    if result != valid:
        print('Failed for ' + x)
Note that my memory of DFA syntax is woefully out of date, so my answer is undoubtedly a little broken. Hopefully this gives you a general idea. I've chosen to ignore + completely. As AmirW states, abc+def and abcdef are the same for divisibility purposes.
Accept state is C.
A=1,4,7,BB,AC,CA
B=2,5,8,AA,BC,CB
C=0,3,6,9,AB,BA,CC
Notice that the above language uses all 9 possible ABC pairings. It will always end at either A,B,or C, and the fact that every variable use is paired means that each iteration of processing will shorten the string of variables.
Example:
1490 = AACC = BCC = BC = B (Fail)
1491 = AACA = BCA = BA = C (Success)
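The reduction table above can be checked with a short simulation. This is only a sketch of that table: the names DIGIT_CLASS, PAIR and reduce_digits are made up for the example, and the start state is taken to be C (an empty sum), which is an assumption.

```python
# Digit classes from the table: A = 1,4,7  B = 2,5,8  C = 0,3,6,9
DIGIT_CLASS = {}
for digits, name in (("147", "A"), ("258", "B"), ("0369", "C")):
    for d in digits:
        DIGIT_CLASS[d] = name

# Pair rules from the table: A = BB,AC,CA  B = AA,BC,CB  C = AB,BA,CC
PAIR = {
    "BB": "A", "AC": "A", "CA": "A",
    "AA": "B", "BC": "B", "CB": "B",
    "AB": "C", "BA": "C", "CC": "C",
}


def reduce_digits(digits):
    """Fold the digit classes left to right, as in the worked examples."""
    state = "C"  # assumed start state: nothing read yet, sum is 0
    for d in digits:
        state = PAIR[state + DIGIT_CLASS[d]]
    return state


print(reduce_digits("1490"))  # B (fail)
print(reduce_digits("1491"))  # C (success)
```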
Not a full solution, just an idea:
(B) alone: The "plus" signs don't matter here. abc + def is the same as abcdef for the sake of divisibility by 3. For the latter case, there is a regexp here: http://blog.vkistudios.com/index.cfm/2008/12/30/Regular-Expression-to-determine-if-a-base-10-number-is-divisible-by-3
To combine this with requirement (A), we can take the solution of (B) and modify it:
The first character read must be in 0..9 (not a plus).
The input must not end with a plus, so duplicate each state (using S for the original state and S' for the duplicate to distinguish between them). If we're in state S and we read a plus, we move to S'.
When reading a digit in S', we go to the same new state as if we had been in S. S' states cannot accept (another) plus.
Also, S' is not an accept state even if S is (because the input must not end with a plus).
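The duplicated-state idea can be simulated as below. This is a sketch of the outline only, with a made-up function name accepts: the remainder plays the role of the original state S, and a flag marks whether we are in the S' copy. Treating the start like an S' state is an assumption that enforces the "first character is not a plus" rule, and leading zeros are not rejected here.

```python
def accepts(text):
    """Simulate the S / S' automaton: digit sum mod 3, with '+' handling."""
    remainder = 0      # digit sum so far, modulo 3 (which original state S)
    after_plus = True  # True at the start and right after '+': the S' copy
    for ch in text:
        if ch == "+":
            if after_plus:
                return False  # S' states cannot accept a plus
            after_plus = True
        elif "0" <= ch <= "9":
            # A digit moves us as if we were in S, and back out of S'.
            remainder = (remainder + int(ch)) % 3
            after_plus = False
        else:
            return False      # character outside the alphabet
    # S' is never an accept state, so the input cannot be empty or end with '+'.
    return not after_plus and remainder == 0


for sample in ("42", "12+3", "34+16", "3+", "+3", "3++3"):
    print(sample, accepts(sample))
```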