find index based on first element in a nested list - list

I have a list that contains sublists. The order of the elements in each sublist is fixed, as is the number of elements.
schedule = [['date1', 'action1', beginvalue1, endvalue1],
            ['date2', 'action2', beginvalue2, endvalue2],
            ...
           ]
Say I have a date and I want to find what I have to do on that date, meaning I need to find the contents of the entire sublist, given only the date.
I did the following (which works): I created an intermediate list with all the first values of the sublists. Using the index of the date in that list, I was able to retrieve the entire sublist, as follows:
dt = 'date150' # To just have a value to make underlying code more clear
ls_intermediate = [item[0] for item in schedule]
index = ls_intermediate.index(dt)
print(schedule[index])
It works but it just does not seem the Python way to do this. How can I improve this piece of code?
To be complete: there are no double 'date' entries in the list. Every date is unique and appears only once.
Learning Python, and having quite a journey in front of me...
thank you!
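
One possible way to avoid the intermediate list is sketched below. It is only an illustration: the sample rows and their numeric values are made up, and it assumes (as stated above) that every date appears at most once. The first variant stops at the first matching sublist; the second builds a dictionary keyed by date, which may be worth it if many lookups are needed.

```python
# Made-up sample data in the same shape as the schedule above.
schedule = [
    ['date1', 'action1', 0, 10],
    ['date2', 'action2', 10, 20],
    ['date150', 'action150', 20, 30],
]
dt = 'date150'

# Variant 1: take the first sublist whose first element equals dt,
# or None if the date is not present.
match = next((row for row in schedule if row[0] == dt), None)
print(match)

# Variant 2: build a lookup table once, then index it by date.
by_date = {row[0]: row for row in schedule}
print(by_date[dt])
```

With `next()` and a default of `None`, a missing date does not raise an exception, whereas `list.index()` raises `ValueError`.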

Related

Check an Array for latest timestamp

[["a","some_variable_data","01.02.2021"]
["a","some_variable_data","01.03.2021"]
["a","some_variable_data","01.04.2021"]
["a","some_variable_data","11.02.2021"]
["b","some_variable_data","01.02.2020"]
["b","some_variable_data","01.03.2020"]
["b","some_variable_data","01.04.2020"]
["b","some_variable_data","11.02.2020"]]
I have to find the latest timestamp for each value of the first array field and mark the corresponding row. So the result should look like:
[["a","some_variable_data","01.02.2021"]
["a","some_variable_data","01.03.2021"]
["a","some_variable_data","01.04.2021","latest"]
["a","some_variable_data","11.02.2021"]
["b","some_variable_data","01.02.2020"]
["b","some_variable_data","01.03.2020"]
["b","some_variable_data","01.04.2020","latest"]
["b","some_variable_data","11.02.2020"]]
I need some help or a hint on how to do this. Can anybody help me? I have to use Python 2.7.
I'm not sure if sorted([ that array ]) works, but you can try that.
Let's say: array = [ that bigger array you shared ].
We know the location of the dates: the 3rd element (index 2). It's also sorted on the first element of the list: "a" or "b"
A possible option is to split the array into two arrays: one starting with "a" and one with "b".
Next, you can sort on the array_a[2] element in the list. Depending on the date and month, you can see if it was before, or after.
At the end, you simply merge the big arrays together:
print([1,2,3]+[4,5,6]) # [1,2,3,4,5,6]
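from datetime import datetime

def latest(n):
    n = list(filter(None, n))  # drop empty entries
    n.sort(key=lambda date: datetime.strptime(date, '%d.%m.%Y'))  # sort chronologically
    n = n[::-1]  # reverse so the newest date comes first
    latest = ""
    for lp in n:
        latest = lp  # first element after reversing is the latest date
        break
    return latest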
Found a way:
Build a list of dates for each first key, sort each list as datetime values, then pick the latest and rebuild the initial list.
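
The approach described above can also be done in a single pass without sorting. The following is a minimal sketch, not the poster's actual code: the function name `mark_latest` is hypothetical, and it assumes the date is always the third field in `%d.%m.%Y` format. It should run on Python 2.7 as well as Python 3.

```python
from datetime import datetime

rows = [
    ["a", "some_variable_data", "01.02.2021"],
    ["a", "some_variable_data", "01.03.2021"],
    ["a", "some_variable_data", "01.04.2021"],
    ["a", "some_variable_data", "11.02.2021"],
    ["b", "some_variable_data", "01.02.2020"],
    ["b", "some_variable_data", "01.03.2020"],
    ["b", "some_variable_data", "01.04.2020"],
    ["b", "some_variable_data", "11.02.2020"],
]

def mark_latest(rows, fmt="%d.%m.%Y"):
    """Return a copy of rows with "latest" appended to the newest row per key."""
    newest = {}  # key -> (parsed date, row index) of the newest row seen so far
    for i, row in enumerate(rows):
        when = datetime.strptime(row[2], fmt)
        key = row[0]
        if key not in newest or when > newest[key][0]:
            newest[key] = (when, i)
    marked = set(i for _, i in newest.values())
    return [row + ["latest"] if i in marked else list(row)
            for i, row in enumerate(rows)]

marked_rows = mark_latest(rows)
for row in marked_rows:
    print(row)
```

Comparing parsed `datetime` objects rather than the raw strings matters here, because `"11.02.2021"` sorts after `"01.04.2021"` as text even though it is the earlier date.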

Attempting to splice a recurring item out of a list

I have extracted files from an online database that consist of roughly 100 titles. Each of these titles has an associated DOI number, and the DOI number is different for each title. To process this, I converted the contents of the website to a list. I then created a for loop to iterate through each item of the list. What I want the program to do is iterate through the entire list, find every place where it says "DOI:", and take the number that follows it. However, the for loop I created only seems to print out the first DOI number and then terminates. How do I make the loop keep going once I have found the first one?
Here is the code:
resulttext = resulttext.split()
print(resulttext)
for item in resulttext:
    if item == "DOI:":
        DOI = resulttext[resulttext.index("DOI:") + 1]  # This parses out the DOI, then takes the item which follows it
print(DOI)
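
A likely cause is that `resulttext.index("DOI:")` always returns the position of the first "DOI:" in the list, no matter how far the loop has advanced. A minimal sketch of one way around this is to track the position with `enumerate()`. The sample text below is invented for illustration and stands in for the real page contents.

```python
# Invented sample text standing in for the downloaded page contents.
resulttext = "Title one DOI: 10.1000/aaa Title two DOI: 10.1000/bbb".split()

dois = []
# Stop one item early so that i + 1 is always a valid index.
for i, item in enumerate(resulttext[:-1]):
    if item == "DOI:":
        dois.append(resulttext[i + 1])  # the token right after this "DOI:"

print(dois)
```

Collecting the values in a list keeps every DOI rather than only the last one assigned.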

compare two dictionary, one with list of float value per key, the other one a value per key (python)

I have a query sequence that I blasted online using NCBIWWW.qblast. In my XML BLAST result file I obtained a list of hits (i.e. gi|) for the query sequence. Each hit (gi|) has multiple HSPs. I made a dictionary my_dict1 where I used each gi| as a key and appended the bit scores as values, so there are multiple values for each key.
my_dict1 = {
    'gi|1002819492|': [437.702, 384.47, 380.86, 380.86, 362.83],
    'gi|675820360|': [2617.97, 2614.37, 122.112],
    'gi|953764029|': [414.258, 318.66, 122.112, 86.158],
    'gi|675820410|': [450.653, 388.08, 386.27]}
Then I looked for the max value for each key using:
for key, value in my_dict1.items():
    max_value = max(value)
And made a second dictionary my_dict2:
my_dict2 = {
    'gi|1002819492|': 437.702,
    'gi|675820360|': 2617.97,
    'gi|953764029|': 414.258,
    'gi|675820410|': 450.653}
I want to compare both dictionaries so that I can extract the HSP with the highest bit score. I am also including other parameters such as query coverage and identity percentage (not shown here). The end goal is to get the best gi|, with the highest bit score, coverage and identity percentage.
I tried many things to compare both dictionaries, like this:
First code :
matches = []
if my_dict1.keys() not in my_dict2.keys():
    matches[hit_id] = bit_score
else:
    matches = matches[hit_id], bit_score
Second code:
if hit_id not in matches.keys():
    matches[hit_id] = bit_score
else:
    matches = matches[hit_id], bit_score
Third code:
intersection = set(set(my_dict1.items()) & set(my_dict2.items()))
However, I always end up with two types of errors:
1) TypeError: list indices must be integers, not unicode
2) ... float not iterable...
Please, I need some help and guidance. Thank you very much in advance for your time. Best regards.
It's not clear what you're trying to do. What is hit_id? What is bit_score? It looks like your second dict is always going to have the same keys as your first if you're creating it by pulling the max value for each key of the first dict.
You say you're trying to compare them, but don't really state what you're actually trying to do. Find those with values under a certain max? Find those with the highest max?
Your first code doesn't work because, I assume, you're trying to use a dict key as an index into matches, which you define as a list. That's probably where your first error is coming from, though you haven't given the lines where the error actually occurs.
See in-code comments below:
# First off, this needs to be a dict.
matches = {}
# This will never happen if you've created these dicts as you stated.
if my_dict1.keys() not in my_dict2.keys():
    matches[hit_id] = bit_score  # Not clear what bit_score is?
else:
    # Also not sure what you're trying to do here. This will assign a tuple
    # to matches with whatever the value of matches[hit_id] is and bit_score.
    matches = matches[hit_id], bit_score
Regardless, we really need more information and the full code to figure out your actual goal and what's going wrong.
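
In the meantime, if the goal turns out to be simply "the highest bit score per hit, and the hit with the highest score overall", a minimal sketch could look like the following. This is a guess at the intent, not a confirmed solution. It reuses the data from the question with the keys written as strings, and the name `best_hit` is introduced here for illustration.

```python
my_dict1 = {
    'gi|1002819492|': [437.702, 384.47, 380.86, 380.86, 362.83],
    'gi|675820360|': [2617.97, 2614.37, 122.112],
    'gi|953764029|': [414.258, 318.66, 122.112, 86.158],
    'gi|675820410|': [450.653, 388.08, 386.27],
}

# Highest bit score per hit, built in one step instead of a loop that
# overwrites max_value on every iteration.
my_dict2 = {hit_id: max(scores) for hit_id, scores in my_dict1.items()}

# The hit whose best score is the highest overall.
best_hit = max(my_dict2, key=my_dict2.get)
print(best_hit, my_dict2[best_hit])
```

Coverage and identity percentage are not shown in the question, so they are left out here.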

Why does random.sample() add square brackets and single quotes to the item sampled?

I'm trying to sample an item (which is one of the keys in a dictionary) from a list and later use the index of that item to find its corresponding value (in the same dictionary).
questions = list(capitals.keys())
answers = list(capitals.values())
for q in range(10):
    queswrite = random.sample(questions, 1)
    number = questions.index(queswrite)
    crtans = answers[number]
Here, capitals is the original dictionary from which the states (keys) and capitals (values) are being sampled.
But apparently the random.sample() method adds square brackets and single quotes to the sampled item, which prevents it from being used to look up the index in the list.
Traceback (most recent call last):
  File "F:\test.py", line 30, in <module>
    number = questions.index(queswrite)
ValueError: ['Delaware'] is not in list
How can I prevent this?
random.sample() returns a list, containing the number of elements you requested. See the documentation:
Return a k length list of unique elements chosen from the population sequence or set. Used for random sampling without replacement.
If you want to pick just one element, however, you don't want a sample; you want a single choice. For that, use the random.choice() function instead:
question = random.choice(questions)
However, given that you are using a loop, you probably really wanted to get 10 unique questions. Don't use a loop over range(10), instead pick a sample of 10 random questions. That's exactly what random.sample() would do for you:
for question in random.sample(questions, 10):
    # pick the answer for this question.
Next, putting both keys and values into two separate lists, then using the index of one to find the other is... inefficient and unnecessary; the keys you pick can be used directly to find the answers:
questions = list(capitals)
for question in random.sample(questions, 10):
    crtans = capitals[question]
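
Putting these points together, here is a small self-contained sketch. The four-entry `capitals` dictionary is a made-up stand-in for the real one, so the sample size is 3 rather than 10.

```python
import random

# Made-up stand-in for the real capitals dictionary.
capitals = {
    "Delaware": "Dover",
    "Texas": "Austin",
    "Ohio": "Columbus",
    "Utah": "Salt Lake City",
}

# random.choice() returns a single element, not a one-element list.
one_state = random.choice(list(capitals))
print(one_state, capitals[one_state])

# random.sample() returns a list of unique elements; look each one up directly.
quiz = [(state, capitals[state]) for state in random.sample(list(capitals), 3)]
for state, capital in quiz:
    print(state, capital)
```

Asking `random.sample()` for more elements than the population holds raises `ValueError`, so the sample size has to stay at or below the number of keys.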

adding frequency elt to a list of list for each elt in python

I have a list of elements (list) which is formatted like this:
3823,La Canebiere,LOCATION
3949,La Canebiere,LOCATION
3959,Phocaeans,LOCATION
3990,Paris,LOCATION
323,Paris,LOCATION
3222,Paris,LOCATION
Some location names (elt[1]) may appear two or more times in the list, but with a different elt[0] (id number). What I'm trying to achieve is to count the frequency of each elt[1] in the list, add this frequency to each elt, and discard duplicates of the name (elt[1]). For my example the new table would be:
3823,La Canebiere,LOCATION,2
3959,Phocaeans,LOCATION,1
3990,Paris,LOCATION,3
I tried a count method and a dictionary for counting frequency, but I don't know how to create the new list that would keep the original entries (without duplicates) plus the frequency. I'm using Python 3. Thank you in advance if you can help!
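
One possible approach, sketched here under the assumption that each element is already split into a list of three strings, is to count the names with `collections.Counter` and then keep only the first row seen for each name. The variable names `rows`, `counts` and `result` are illustrative.

```python
from collections import Counter

rows = [
    ["3823", "La Canebiere", "LOCATION"],
    ["3949", "La Canebiere", "LOCATION"],
    ["3959", "Phocaeans", "LOCATION"],
    ["3990", "Paris", "LOCATION"],
    ["323", "Paris", "LOCATION"],
    ["3222", "Paris", "LOCATION"],
]

# How often each location name (elt[1]) occurs.
counts = Counter(row[1] for row in rows)

seen = set()
result = []
for row in rows:
    name = row[1]
    if name not in seen:  # keep only the first row for each name
        seen.add(name)
        result.append(row + [counts[name]])

for row in result:
    print(row)
```

This keeps the id of the first occurrence of each name, which matches the expected output shown above, and leaves the original list unchanged.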