removing cyclic substrings from python list - python-2.7

I have a Python list like the following:
['IKW', 'IQW', 'IWK', 'IWQ', 'KIW', 'KLW', 'KWI', 'KWL', 'LKW', 'LQW', 'LWK', 'LWQ', 'QIW', 'QLW', 'QWI', 'QWL', 'WIK', 'WIQ', 'WKI', 'WKL', 'WLK', 'WLQ', 'WQI', 'WQL']
If we pick, say, the second element IQW, we see that the list contains duplicates of this item; however, they are not noticeable right away. This is because the string is cyclic: the following are all equivalent.
IQW, QWI, WIQ
It could also be read backwards, which counts as a duplicate too, so I want those removed as well. The full list of duplicates (the rotations plus the reverse of each one) is therefore
IQW, QWI, WIQ, WQI, IWQ, QIW
So essentially I would like IQW to be the only one left.
Bonus points if the one that remains in the list is sorted alphabetically.
My approach was to sort the letters of each item alphabetically:
`IQW`, `QWI`, `WIQ`, `WQI`, `IWQ`, `QIW` ->
`IQW`, `IQW`, `IQW`, `IQW`, `IQW`, `IQW`
and then remove the duplicates.
However, this also removes items that are not cyclic duplicates. Say I have ABCD and CDAB. These are not the same because the ends only meet once, but my method sorts both to ABCD and removes one.
My code:
print cur_list
sortedlist = list()
for i in range(len(cur_list)):
    sortedlist.append(''.join(map(str, sorted(cur_list[i]))))
sortedlist = set(sortedlist)

L = ['IKW', 'IQW', 'IWK', 'IWQ', 'KIW', 'KLW', 'KWI', 'KWL', 'LKW', 'LQW', 'LWK', 'LWQ', 'QIW', 'QLW', 'QWI', 'QWL', 'WIK', 'WIQ', 'WKI', 'WKL', 'WLK', 'WLQ', 'WQI', 'WQL']
seen = set()
res = []
for item in L:
    # rotate the item so that it starts with its smallest letter
    c = item.index(min(item))
    item = item[c:] + item[:c]
    if item not in seen:
        seen.add(item)
        # also mark the reversed reading (same first letter) as seen
        seen.add(item[0]+item[-1:0:-1])
        res.append(item)
print res
output:
['IKW', 'IQW', 'KLW', 'LQW']
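The answer above rotates each item to start at its smallest letter, which relies on that letter occurring only once in the item. A minimal sketch of a variant that does not need this assumption, written for Python 3 rather than the question's 2.7: use the smallest string among all rotations of the item and of its reverse as a canonical key. The names `canonical` and `dedupe_cyclic` are hypothetical, chosen only for this illustration.

```python
def canonical(s):
    """Return the smallest string among all rotations of s and of reversed s."""
    candidates = []
    for t in (s, s[::-1]):
        for i in range(len(t)):
            candidates.append(t[i:] + t[:i])
    return min(candidates) if candidates else s


def dedupe_cyclic(items):
    """Keep one canonical representative per cyclic/reversed equivalence class."""
    seen = set()
    result = []
    for item in items:
        key = canonical(item)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


L = ['IKW', 'IQW', 'IWK', 'IWQ', 'KIW', 'KLW', 'KWI', 'KWL',
     'LKW', 'LQW', 'LWK', 'LWQ', 'QIW', 'QLW', 'QWI', 'QWL',
     'WIK', 'WIQ', 'WKI', 'WKL', 'WLK', 'WLQ', 'WQI', 'WQL']
print(dedupe_cyclic(L))  # ['IKW', 'IQW', 'KLW', 'LQW']
```

Because the key is built from rotations rather than from sorted letters, items such as ABCD and ACBD, which contain the same letters in a different cyclic order, stay distinct.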

Here is the solution I coded. If anyone has a better algorithm, I will accept that as the answer:
mylist = list()
for item in copy_of_cur:
    # doubling the string makes every rotation appear as a substring
    linear_peptide = item+item
    # subpeptides_linear is my own helper (not shown here)
    mylist = filter(lambda x: len(x) == 3 , subpeptides_linear(linear_peptide))
    for subitem in mylist:
        if subitem != item:
            if subitem in cur_list:
                cur_list.remove(subitem)
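The `item+item` step works because every rotation of a string appears as a substring of the string concatenated with itself. Since `subpeptides_linear` is not shown, here is a rough Python 3 sketch that produces the rotations directly; `rotations` and `is_cyclic_duplicate` are hypothetical helper names, not part of the original code.

```python
def rotations(s):
    """All rotations of s, read off as windows of the doubled string."""
    doubled = s + s
    n = len(s)
    return [doubled[i:i + n] for i in range(n)]


def is_cyclic_duplicate(a, b):
    """True if b is a rotation of a, or a rotation of a read backwards."""
    if len(a) != len(b):
        return False
    return b in rotations(a) or b in rotations(a[::-1])


print(rotations("IQW"))                    # ['IQW', 'QWI', 'WIQ']
print(is_cyclic_duplicate("IQW", "QIW"))   # True (rotation of the reverse)
print(is_cyclic_duplicate("IQW", "IKW"))   # False
```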

Related

Python, FOR looping - creating lists

This is my code to create lists, but it's brutal and inelegant. Do you have any idea how to make it smoother?
The thing is, I want to write code that lets you create your own lists, choosing how many lists to create and how many items each should have, without using a while loop. I can manage to create a certain number of lists by passing the input (number_of_lists) to range() in the for loop:
i = 0
number_of_lists = input('How many lists you want to make? >')
for cycle in range(number_of_lists):   # this was originally range(3),
    item1 = raw_input('1. item > ')    # and it will only work properly
    item2 = raw_input('2. item > ')    # now if number_of_lists is exactly 3
    item3 = raw_input('3. item > ')
    # everything is wrong with this code; I need it to be
    # much more autonomous than it is now
    print "-------------------"
    if i == 0:
        list1 = [item1, item2, item3]
    if i == 1:
        list2 = [item1, item2, item3]
    if i == 2:
        list3 = [item1, item2, item3]
    i += 1
print list1
print list2
print list3
I also want to avoid all those 'if i == int' checks.
Right now it only creates 3 lists, because I originally used the integer 3 instead of number_of_lists.
I hope you can see my problem now. I need to create new lists from input and name them if possible, so that instead of list1 I can call one DOGS or whatever.
I need it all to be much simpler and more interconnected. I hope you understand my problem and maybe have a smooth solution, thanks :)
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
OK, I think I've got it now. This is the new version, which does pretty much what I want:
number_of_lists = input('How many lists you want to make? >')
allItems = []
for cycle in range(int(number_of_lists)):
    items = []
    number_of_items = input('How much items in this list? >')
    for i in range(int(number_of_items)):
        item = raw_input(str(i+1) + ". item > ")
        items.append(item)
    allItems.append(items)
    print("-------------------")
print allItems
If anyone has an idea how to make this more efficient and clearer, let me know here! :) Thanks for the help, guys.
You can add your lists to another list, that way it's dynamic like you want. Example below:
number_of_lists = input('How many lists you want to make? >')
allItems = []
for cycle in range(int(number_of_lists)):
    items = []
    for i in range(1, 4):
        item = input(str(i) + ".item > ")
        items.append(item)
    allItems.append(items)
    print("-------------------")
for items in allItems:
    for item in items:
        print(item)
    print("-------------")
You'd still need to check if number_of_lists is an int before parsing it into an int. If the user types a letter it will throw an error.
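One possible way to handle that check, and to give each list a name such as DOGS as the question asks, is sketched below for Python 3. The function names `read_int` and `build_named_lists`, and the `read` parameter (a stand-in for `input` so the demo can run without a keyboard), are assumptions made for this example. Storing the lists in a dict keyed by name avoids creating variables like list1, list2, list3.

```python
def read_int(prompt, read=input):
    """Keep asking until the user types a whole number that is not negative."""
    while True:
        text = read(prompt)
        try:
            value = int(text)
        except ValueError:
            print("Please enter a whole number.")
            continue
        if value >= 0:
            return value
        print("Please enter a number that is not negative.")


def build_named_lists(read=input):
    """Return a dict mapping a user-chosen name to a list of items."""
    lists = {}
    for _ in range(read_int("How many lists do you want to make? > ", read)):
        name = read("Name of this list > ")
        count = read_int("How many items in this list? > ", read)
        lists[name] = [read("%d. item > " % (i + 1)) for i in range(count)]
    return lists


# Demo with scripted answers instead of keyboard input;
# the first answer ("two") is rejected and asked again.
answers = iter(["two", "2", "DOGS", "2", "Rex", "Fido", "CATS", "1", "Tom"])
demo = build_named_lists(read=lambda prompt: next(answers))
print(demo)  # {'DOGS': ['Rex', 'Fido'], 'CATS': ['Tom']}
```

In an interactive session, calling `build_named_lists()` with no argument would read from the keyboard.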

Can I assign position of item in list?

ex = ['$5','Amazon','spoon']
I want to re-order this list, website - item - price.
Can I assign the index, for instance, ex.index('Amazon') = 1?
I'd like the result to be ['Amazon','spoon','$5']
I found information on how to swap positions, but I would like to know if I can assign an index for each item myself.
You cannot assign an index to an item, but you can build a permuted list according to a permutation pattern:
ex = ['$5','Amazon','spoon']
order = [1, 2, 0]
ex_new = [ex[i] for i in order]
print(ex_new)
#['Amazon', 'spoon', '$5']
Alternatively, you can overwrite the original list in place:
ex[:] = [ex[i] for i in order]
print(ex)
#['Amazon', 'spoon', '$5']
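If it feels more natural to state where each item should go, rather than which item to take for each slot, the inverse mapping can be sketched as follows. The helper name `place` and the `targets` list are hypothetical; `targets[i]` is the new index for `items[i]`.

```python
def place(items, targets):
    """Return a new list in which items[i] sits at index targets[i]."""
    if sorted(targets) != list(range(len(items))):
        raise ValueError("targets must use each index 0..n-1 exactly once")
    result = [None] * len(items)
    for item, pos in zip(items, targets):
        result[pos] = item
    return result


ex = ['$5', 'Amazon', 'spoon']
# '$5' goes to index 2, 'Amazon' to index 0, 'spoon' to index 1
print(place(ex, [2, 0, 1]))  # ['Amazon', 'spoon', '$5']
```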

print elements in list in reversed order and on a new line

I want to print all elements in this list in reversed order and every element in this list must be on a new line.
For example if the list is ['i', 'am', 'programming', 'with', 'python'] it should print out:
python
with
programming
am
i
What is the best way to do this?
def list():
    words = []
    while True:
        output = input("Type a word: ")
        if output == "stop":
            break
        else:
            words.append(output)
    for elements in words:
        print(elements)
list()
Generic approach (pseudocode):
for (i = wordsArray.size() - 1; i >= 0; i--) {
    print(wordsArray[i] + "\n")
}
Start the iteration at the end of the list, at its last element.
Then iterate backwards, decreasing the index by 1 until you reach 0.
Print each element followed by a newline symbol, which might depend on the OS and language.
In Python you can reverse a list in place by:
words = ['i', 'am', 'programming', 'with', 'python']
words.reverse()
for w in words:
print(w)
If you want to iterate in reverse but keep the original order, you can use a slice:
for w in words[::-1]:
print(w)
Slice syntax is [begin:end:step]. Here the begin (inclusive) and end (exclusive) indices are omitted, so the slice covers all elements, and the step of -1 returns them in reverse order.
Both methods produce the same output.
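A third option, sketched here, is the built-in `reversed()`, which iterates backwards without modifying the list and without copying it the way a slice does. Joining with "\n" builds the whole output as one string if that is more convenient than a loop.

```python
words = ['i', 'am', 'programming', 'with', 'python']

# reversed() returns an iterator; words itself is left unchanged
for w in reversed(words):
    print(w)

# the same output built as a single string
text = "\n".join(reversed(words))
print(text)
```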

Multiple lists of the same length to csv

I have a couple List<string>s, with the format like this:
List 1    List 2    List 3
1         A         One
2         B         Two
3         C         Three
4         D         Four
5         E         Five
So in code form, it's like:
List<string> list1 = {"1","2","3","4","5"};
List<string> list2 = {"A","B","C","D","E"};
List<string> list3 = {"One","Two","Three","Four","Five"};
My questions are:
How do I transform those three lists to a CSV format?
list1,list2,list3
1,A,One
2,B,Two
3,C,Three
4,D,Four
5,E,Five
Should I append , to the end of each index or make the delimiter its own index within the multidimensional list?
If performance is your main concern, I would use an existing csv library for your language, as it's probably been pretty well optimized.
If that's too much overhead, and you just want a simple function, I use the same concept in some of my code. I use the join/implode function of a language to create a list of comma separated strings, then join that list with \n.
I'm used to doing this in a dynamic language, but you can see the concept in the following pseudocode example:
header = ["List1", "List2", "List3"]
list1 = ["1","2","3","4","5"]
list2 = ["A","B","C","D","E"]
list3 = ["One","Two","Three","Four","Five"]
# build one row per index so that each CSV line holds one item from every list
rows = [header.join(",")]
for i in 0 .. list1.length - 1
    rows.append([list1[i], list2[i], list3[i]].join(","))
csv = rows.join("\n")
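As an illustration of the library route, here is a small Python 3 sketch using the standard `csv` module; the question's `List<string>` suggests a different language, so treat this as a demonstration of the idea only. The function name `lists_to_csv` is hypothetical. `zip` pairs up the i-th element of every list, which gives one CSV row per index.

```python
import csv
import io


def lists_to_csv(header, *columns):
    """Write equal-length lists side by side as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    # zip(*columns) yields (list1[i], list2[i], list3[i]) for each i
    writer.writerows(zip(*columns))
    return buffer.getvalue()


list1 = ["1", "2", "3", "4", "5"]
list2 = ["A", "B", "C", "D", "E"]
list3 = ["One", "Two", "Three", "Four", "Five"]
csv_text = lists_to_csv(["list1", "list2", "list3"], list1, list2, list3)
print(csv_text)
```

The writer also takes care of quoting values that themselves contain commas or quotes, which a plain join would not.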

Getting duplicates in dict

I have a dictionary, say d, that looks like this:
d = {'file1': 4098, 'file2': 4139, 'file3': 4098, 'file4': 1353, 'file5': 4139}
Now, I've figured out how to get it to tell me whether there are any duplicates or not. But what I'd like it to do is tell me whether there are any, and which 2 (or more) values (and corresponding keys) are duplicates.
The output for the above would tell me that file1 and file3 are identical and that file2 and file5 are identical.
I've been trying to wrap my head around it for a few hours, and haven't found the right solution yet.
Try this to get the duplicates:
dups = [item for item in d.items() if list(d.values()).count(item[1]) > 1]
That outputs (the order depends on the dict's iteration order):
[('file3', 4098), ('file2', 4139), ('file1', 4098), ('file5', 4139)]
Next, sort the list by the second item in each tuple (this needs import operator):
dups = sorted(dups, key=operator.itemgetter(1))
Finally, use itertools.groupby() (this needs import itertools) to group by the second item:
groups = [list(group) for key, group in itertools.groupby(dups, operator.itemgetter(1))]
Final output:
[[('file3', 4098), ('file1', 4098)], [('file2', 4139), ('file5', 4139)]]
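The approach above calls count() once per item, so its cost grows quadratically with the size of the dict. An alternative sketch for Python 3 that makes a single pass is to invert the dict into value -> list of keys and keep only the values that occur more than once. The function name `find_duplicates` is hypothetical.

```python
from collections import defaultdict


def find_duplicates(d):
    """Map each value that occurs more than once to the keys that share it."""
    by_value = defaultdict(list)
    for key, value in d.items():
        by_value[value].append(key)
    return {value: keys for value, keys in by_value.items() if len(keys) > 1}


d = {'file1': 4098, 'file2': 4139, 'file3': 4098, 'file4': 1353, 'file5': 4139}
duplicates = find_duplicates(d)
for value, keys in duplicates.items():
    print(value, "is shared by", ", ".join(keys))
```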