hi, im new to python and i was wondering how would i go about adding values to lists based from the index of another list - python-3.1

new_list =[0,0,0,0]
for x_list in random_list: # list of list
for x in x_list:
if x == "I" or "i":
list_index = x_list.index(x)
new_list[list_index] += 1
Lets say the random_list was [['x','x','I','I'],['x','x','I','x']]
it should output [0,0,2,1], but it doesn't

You may be looking for the built-in enumerate, which allows you to iterate over the objects and indices at the same time. your inner loop could then be
for list_index, x in enumerate(x_list):
if x == "I" or "i":
new_list[list_index] += 1
and it should do what you want
PS
One more gotcha: if x == "I" or "i": should actually be if x == "I" or x == "i":

Related

Problem to print list of tuple for every sentence seperately

If we have situation like that
[(ali,noun),(ahmad, noun),(play , verb)], [(read, verb), (is, helping verb), (waqar, noun)]
I want to print only verb from these list of tuple but when i print it will show whole data list are merging?
Done with Answer
num = [x[0] for x in tagged if x[1] == 'VBG' or x[1] == 'VBD' or x[1] == 'VBN' or x[1]== 'VB' or x[1]=='JJ']

python 2.7 iterate on list printing subsets of a list

I have this list: l = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20] and I would like to iterate on it, printing something like
0: [1,2,3,4]
1: [2,3,4,5]
2: [3,4,5,6]
...
n: [17,18,19,20]
So far I made this code to print 5 elements at a time, but the last iteration prints 3:
for index, item in enumerate(l):
if index == 0 or index == 1 or index == 2:
continue
print index, l[index - 3:index + 2]
How can I solve this?
You're on the right track with your list slices. Here's a tweak to make it easier:
sub_len = 4
for i in range(len(mylist) - sub_len + 1):
print l[i:i + sub_len]
Where sub_len is the desired length of the slices you print.
demo

x=5,6 if x==6 print 6 else print not 6

I have started learning python and using online interpreter for python 2.9-pythontutor
x=5,6
if x==5:
print "5"
else:
print "not"
It goes in else loop and print not.
why is that?
what exactly x=5,6 means?
, is tuple expr, where x,y will return a tuple (x,y)
so expression 5,6 will return a tuple (5,6)
x is nether 5 nor 6 but a tuple
When you declared x = 5, 6 you made it a tuple. Then later when you do x == 5 this translates to (5, 6) == 5 which is not true, so the else branch is run.
If instead you did x[0] == 5 that would be true, and print 5. Because we are accessing the 0 index of the tuple, which is equal to 5. Check out some tutorials on tuples for more info.
In Python when you write x = 4, 5, it is same as declaring a tuple as x = (4, 5). In interpreter, if you write:
>>> x = 4, 5
>>> x
(4, 5)
Hence, it is similar to comparing a tuple with an int.
X here acts as an array, where x is pointed to the first element of the array as x [0] = 5 and x [1] = 6
Execute this code, and the display will be 5
x=5,6
if x[0]==5:
print "5"
else:
print "not"
and try to See this link "http://www.pythontutor.com/visualize.html#mode=edit " you can run your code python step by step

Compare strings and just keep those who have on same positions different characters [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
Improve this question
my question is the following: I have a file which contains around 70 strings, all of them have 6 characters (either a,c,g or t for every position -> these are short DNA-sequences).
For example:
accggt agctta gggatc gactta ccttgg
What I need are the strings which are completely unique. Which have on every position a different character (base) compared with the other strings.
In this case I would get two matches (I define them as lists but this is only an idea for the output format):
[accggt , gggatc]
[gggatc , ccttgg]
The elements of list one are on every position different and so are also the elements of list 2.
Is there a build-in function which can do it? I also thought of regular expression but I'm not that familar with this approach.
Thanks in advance!
Edit:
Ok, it seems it is not that easy to describe. So lets go into more detail:
Let's take the five strings mentioned above:
I would start to compare the first string with all the other strings and then continue with string 2 comparing with all other strings and so on.
The first character of the first string is an a.
The first character of the second string is also an a.
This means I would discard the second string.
The first character of the third string is an g.
Fine.
The second character of the first string is an c.
The second character of the third string is an g.
Fine.
The third character of the first string is an c.
The third character of the third string is an g.
Fine.
The fourth ... and so on.
And if all characters of a string are different from the characters of another string (on every position like described above) I would keep those two strings and would search for the next strings which are different on every position compared to the strings I already found. Because I only have four letters there should be only four possibilities fo different strings.
I should end up with, probably a list, which contains the groups of strings which are different in every position.
I hope this helps.
You can use the following algorithm: iterate through all possible word combinations in your string and check each pair for equality with if [x == y for (x, y) in zip(word, nextWord)].count(True) == 0:.
Here is a snippet:
s = "accggt agctta gggatc gactta ccttgg"
chks = s.split(" ");
for word in chks:
for nextWord in chks:
if word != nextWord:
if [x == y for (x, y) in zip(word, nextWord)].count(True) == 0:
print([word, nextWord])
Result of the IDEONE demo:
['accggt', 'gggatc']
['gggatc', 'accggt']
['gggatc', 'ccttgg']
['ccttgg', 'gggatc']
UPDATE
You can deduplicate the list with a custom function. Here is an updated snippet:
def dedup(lst):
seen = set()
result = []
for item in lst:
fs = frozenset(item)
if fs not in seen:
result.append(item)
seen.add(fs)
return result
res = []
s = "accggt agctta gggatc gactta ccttgg"
chks = s.split(" ");
for word in chks:
for nextWord in chks:
if word != nextWord:
if [x == y for (x, y) in zip(word, nextWord)].count(True) == 0:
res.append([word, nextWord])
print(dedup(res))
Result: [['accggt', 'gggatc'], ['gggatc', 'ccttgg']].
To check the words by 3, you need to create all possible permutations of the string into 3-word combinations and use something like:
from itertools import permutations
def dedup(lst):
seen = set()
result = []
for item in lst:
fs = frozenset(item)
if fs not in seen:
result.append(item)
seen.add(fs)
return result
res = []
s = "accggt agctta gggatc gactta ccttgg"
chks = s.split(" ");
perms = [p for p in permutations(chks, 3)]
for perm in perms:
if [(x == y or y == z or x == z) for (x, y, z) in zip(*perm)].count(True) == 0:
res.append(perm)
print(dedup(res))
To find the DNA strings which are completely different on every character you have to check every string against any other string if any character of the given string is the same character on the same position in the comparing string.
Here is an example code for that:
# read all dna strings into a list of strings
dna = ['accggt', 'agctta', 'gggatc', 'gactta', 'ccttgg', '123456']
def compare_two_dna(dna1, dna2):
i = 0
l = len(dna1)
while(i < l):
if dna1[i] == dna2[i]:
return True
i += 1
return False
def is_dna_unique(d, dna_strings):
return len(filter(lambda x: compare_two_dna(d, x), dna_strings)) == 1
# filter all items which only occure once in the list
unique_dna = filter(lambda d: is_dna_unique(d, dna), dna)
print(unique_dna)
The result here is: 123456
var dnaList = "accggt agctta gggatc gactta ccttgg".split( " " );
function getUniqueDnas( dna_list ){
var result = [];
for( var d1 in dna_list ){
var isRepeat = false;
var dna1 = dna_list[ d1 ];
for( var d2 in dna_list ){
var dna2 = dna_list[ d2 ];
if( dna1 == dna2 ){
isRepeat = true;
break;
}
}
if( !isRepeat )
result.push( dna1 );
}
return result;
}
var uniqueDnaList = getUniqueDnas( dnaList );

Deleting duplicate x values and their corresponding y values

I am working with a list of points in python 2.7 and running some interpolations on the data. My list has over 5000 points and I have some repeating "x" values within my list. These repeating "x" values have different corresponding "y" values. I want to get rid of these repeating points so that my interpolation function will work, because if there are repeating "x" values with different "y" values it runs an error because it does not satisfy the criteria of a function. Here is a simple example of what I am trying to do:
Input:
x = [1,1,3,4,5]
y = [10,20,30,40,50]
Output:
xy = [(1,10),(3,30),(4,40),(5,50)]
The interpolation function I am using is InterpolatedUnivariateSpline(x, y)
have a variable where you store the previous X value, if it is the same as the current value then skip the current value.
For example (pseudo code, you do the python),
int previousX = -1
foreach X
{
if(x == previousX)
{/*skip*/}
else
{
InterpolatedUnivariateSpline(x, y)
previousX = x /*store the x value that will be "previous" in next iteration
}
}
i am assuming you are already iterating so you dont need the actualy python code.
A bit late but if anyone is interested, here's a solution with numpy and pandas:
import pandas as pd
import numpy as np
x = [1,1,3,4,5]
y = [10,20,30,40,50]
#convert list into numpy arrays:
array_x, array_y = np.array(x), np.array(y)
# sort x and y by x value
order = np.argsort(array_x)
xsort, ysort = array_x[order], array_y[order]
#create a dataframe and add 2 columns for your x and y data:
df = pd.DataFrame()
df['xsort'] = xsort
df['ysort'] = ysort
#create new dataframe (mean) with no duplicate x values and corresponding mean values in all other cols:
mean = df.groupby('xsort').mean()
df_x = mean.index
df_y = mean['ysort']
# poly1d to create a polynomial line from coefficient inputs:
trend = np.polyfit(df_x, df_y, 14)
trendpoly = np.poly1d(trend)
# plot polyfit line:
plt.plot(df_x, trendpoly(df_x), linestyle=':', dashes=(6, 5), linewidth='0.8',
color=colour, zorder=9, figure=[name of figure])
Also, if you just use argsort() on the values in order of x, the interpolation should work even without the having to delete the duplicate x values. Trying on my own dataset:
polyfit on its own
sorting data in order of x first, then polyfit
sorting data, delete duplicates, then polyfit
... I get the same result twice