Merging 2 lists of dicts based on common values - list

So I have 2 lists of dicts in Python as follows:
list1 = [
{
"medication_name": "Actemra IV",
"growth": 0,
"total_prescriptions": 3
},
{
"medication_name": "Actemra SC",
"growth": 0.0,
"total_prescriptions": 2
},
{
"medication_name": "Adempas",
"growth": 0,
"total_prescriptions": 1
}
]
list2 = [
{
"medication_name": "Actemra IV",
"fulfillment_time": 94340
},
{
"medication_name": "Actemra SC",
"fulfillment_time": 151800
},
{
"medication_name": "Adempas",
"fulfillment_time": 156660
}
]
What I would want is to have list1 appended with the fulfillment_time key from list2 so that the output is as follows:
[
{
"medication_name": "Actemra IV",
"growth": 0,
"fulfillment_time": 94340,
"total_prescriptions": 3
},
{
"medication_name": "Actemra SC",
"growth": 0.0,
"fulfillment_time": 151800,
"total_prescriptions": 2
},
{
"medication_name": "Adempas",
"growth": 0,
"fulfillment_time": 156660,
"total_prescriptions": 1
}
]
I achieved this in the traditional way of looping over both lists as follows:
for i in list1:
    for j in list2:
        if i['medication_name'] == j['medication_name']:
            i['fulfillment_time'] = j['fulfillment_time']
What I would like to know is whether Python already has a built-in one-line function that performs the same task and that I may not know of.

There is no "one line" way to do what you want, mainly because it's not a very natural operation: the data structures you are using don't naturally allow the operations you want to do. This is also borne out by the fact that your algorithm is rather inefficient: it loops all the way through list2 for every element of list1. It is quadratic in the number of elements.
It seems like you are thinking of the 'medication_name' as the key for the dictionaries in the list. But the list type provides no operation to find elements by that key.
A more pythonic approach would be to convert the list into a dictionary: then finding the right dictionary will become O(1). Something like this:
d = {i['medication_name']: i for i in list2}
for i in list1:
    i.update(d[i['medication_name']])
Python does provide a one-liner to merge the dictionaries, as shown.
This does raise the question of what you want to do if list2 contains no entry for one of the entries in list1: a try/except could be used to deal with that.
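As a rough sketch of one way to handle a missing entry without try/except, the lookup can fall back to an empty dict via dict.get, so unmatched items in list1 are simply left unchanged. The shortened sample data and the name by_name are assumptions made for illustration only:

```python
# Shortened sample data (assumed for illustration); "Adempas" has no match in list2.
list1 = [
    {"medication_name": "Actemra IV", "growth": 0, "total_prescriptions": 3},
    {"medication_name": "Adempas", "growth": 0, "total_prescriptions": 1},
]
list2 = [
    {"medication_name": "Actemra IV", "fulfillment_time": 94340},
]

# Index list2 by medication name so each lookup is a single dict access.
by_name = {entry["medication_name"]: entry for entry in list2}

for entry in list1:
    # dict.get returns {} when there is no match, so update() becomes a no-op.
    entry.update(by_name.get(entry["medication_name"], {}))
```

Whether a silent skip or an explicit error is the better behavior depends on how trustworthy the two lists are.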
These data structures are a little bit "database-like". Perhaps you should be using sqlite3?

Related

Length of List with conditionals in dart

Is there a way to get 4 instead of 2 as a result?
List<String> test = [
'first',
'second',
if (false) 'third',
if (false) 'fourth',
];
print('length: ' + test.length.toString());
The length property on lists returns the number of elements in the list. In your example you are only inserting two values (because of the conditions), so a length of 4 would not make sense and would cause problems when, for example, you want to iterate over the list.
You can, however, add null elements when the conditions are false, like this:
void main() {
  // String? is needed under null safety because the list holds null placeholders.
  List<String?> list = [
    'first',
    'second',
    (false) ? 'third' : null,
    (false) ? 'fourth' : null,
  ];
  final listLengthIncludingConditions = list.length;
  list.removeWhere((x) => x == null); // drop the null placeholders
  print('Number of possible elements in list: $listLengthIncludingConditions'); // 4
  print('Number of elements in list: ${list.length}'); // 2
}
You can then save the length and remove the null elements.

NetLogo: How to compare two sublists

I am not from a computer science background and I am also new to NetLogo, so I would really appreciate your help. My question is as follows:
Let's assume that I have a list of three lists:
Let mylist [ [ 1 2 3 4 5 ] [ 2 2 2 2 2 ] [ 3 3 3 3 3 ] ]
I would like to check each item within item 2 mylist (i.e. [ 3 3 3 3 3 ]) and see whether it differs from the corresponding item in item 0 mylist (i.e. [ 1 2 3 4 5 ]). If that is the case, I would like to subtract a constant value, 5, from that item in item 2 mylist.
In other words, I would like mylist to be changed to the following:
[ [ 1 2 3 4 5 ] [ 2 2 2 2 2 ] [ -2 -2 3 -2 -2 ] ]
Thanks in advance,
Your own answer is fine, but here is what I consider to be a somewhat more elegant way to do it:
print lput (map [ [ a b ] ->
ifelse-value (a = b) [ b ] [ b - 5 ]
] (item 0 mylist) (item 2 mylist)) but-last mylist
The key primitive here is map, which is to foreach what of is to ask: instead of running a command on each element, it runs a reporter and builds a new list out of the results. In this particular case, it saves you from having to mess with indices and replace-item inside.
The combination of lput and but-last makes it easy to replace the last sublist in your main list, but you could also use replace-item for that. Or, depending on what you need it for, you could just use the result of map directly instead of putting it back in your main list.
I managed to solve the problem by separating the sublists:
to go
  let mylist [ [ 1 2 3 4 5 ] [ 2 2 2 2 2 ] [ 3 3 3 3 3 ] ]
  let auxiliar-list1 item 0 mylist
  let auxiliar-list2 item 2 mylist
  foreach ( range 0 ( length auxiliar-list1 ) ) [ num-item ->
    if item num-item auxiliar-list1 != item num-item auxiliar-list2 [
      set auxiliar-list2 replace-item num-item auxiliar-list2 (item num-item auxiliar-list2 - 5)
      show mylist
      show auxiliar-list1
      show auxiliar-list2
    ]
  ]
end

Merging python dictionaries differently

I have Python dictionaries within a list as follows:
[{"item1": {"item2": "300", "item3" : "10"}},
{"item2": { "item4": "90", "item5": "400" }},
{"item5": {"item6": "16"}},
{"item3": {"item8": "ava", "item1" : "xxx","item5": "400"}}]
And I want to construct a dictionary as follows
{
    "item1": {
        "item2": "300",
        "item4": "90",
        "item5": "400",
        "item6": "16",
        "item3": "10",
        "item8": "ava"
    }
}
Traversing method:
1) Start with item1, which is a dict with two keys. Add it to the new dict.
2) Then take the first key in the first dict and check whether there is any dictionary with this key. item2 is again a dict with two keys, so add those keys to the new dict.
3) Repeat the same for the keys in item2 until there is nothing left to traverse. While traversing, skip any dict that has already been traversed. For example, the last dict, item3, contains item5, which has already been traversed and added to the new dict, so we can skip traversing this key.
4) Repeat step 2 for the second key in item1 (this has to be repeated depending on the number of keys in the first dict).
I know this is rather complex. Is there any way to achieve this?
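One possible reading of the traversal steps above is a depth-first walk that records each key the first time it is seen and then follows it if it names another dict. The sketch below is only an interpretation of those steps; the function name flatten_from and the decision to skip the root key when it reappears (item1 inside item3) are assumptions, chosen because they reproduce the desired output:

```python
records = [
    {"item1": {"item2": "300", "item3": "10"}},
    {"item2": {"item4": "90", "item5": "400"}},
    {"item5": {"item6": "16"}},
    {"item3": {"item8": "ava", "item1": "xxx", "item5": "400"}},
]

def flatten_from(root, records):
    """Collect every key reachable from root, depth first (hypothetical helper)."""
    # Merge the single-key dicts into one lookup table: name -> nested dict.
    lookup = {}
    for record in records:
        lookup.update(record)

    merged = {}
    visited = {root}  # the root itself is never added as a nested key

    def visit(name):
        for key, value in lookup.get(name, {}).items():
            if key in visited:
                continue  # already traversed, skip as in step 3
            visited.add(key)
            merged[key] = value
            visit(key)  # follow the key if it names another dict

    visit(root)
    return {root: merged}

result = flatten_from("item1", records)
```

Because dicts preserve insertion order in Python 3.7+, the keys come out in traversal order: item2, item4, item5, item6, item3, item8.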

using mapreduce to find common items between users

Suppose I have the following user/item sets where items could also be replicates for each user (like user1)
{ "u1", "item" : [ "a", "a", "c","h" ] }
{ "u2", "item" : [ "b", "a", "f" ] }
{ "u3", "item" : [ "a", "a", "f" ] }
I want to find a map-reduce algorithm that will calculate the number of common items between each pair of users, something like this:
{ "u1_u2", "common_items" : 1 }
{ "u1_u3", "common_items" : 2 }
{ "u2_u3", "common_items" : 2 }
It basically finds the intersections of itemsets for each pair and considers replicates as new items. I am new to mapreduce, how can I do a map-reduce for this?
With these sorts of problems, you need to appreciate that some algorithms will scale better than others, and performance of any one algorithm will depend on the 'shape' and size of your data.
Comparing the item sets for every user to every other user might be appropriate for small domain datasets (say thousands of users, maybe even tens of thousands, with a similar number of items), but it is an 'n squared' problem (or an order of thereabouts, my Big O is rusty to say the least!):
Users Comparisons
----- -----------
2 1
3 3
4 6
5 10
6 15
n (n^2 - n)/2
So a user domain of 100,000 would yield 4,999,950,000 set comparisons.
Another approach to this problem would be to invert the relationship, so run a Map Reduce job to generate a map of items to users:
'a' : [ 'u1', 'u2', 'u3' ],
'b' : [ 'u2' ],
'c' : [ 'u1' ],
'f' : [ 'u2', 'u3' ],
'h' : [ 'u1' ],
From there you can iterate the users for each item and output user pairs (with a count of one):
'a' would produce: [ 'u1_u2' : 1, 'u1_u3' : 1, 'u2_u3' : 1 ]
'f' would produce: [ 'u2_u3' : 1 ]
Then finally produce the sum for each user pairing:
[ 'u1_u2' : 1, 'u1_u3' : 1, 'u2_u3' : 2 ]
This doesn't produce the behavior you are interested in (the double a's in both the u1 and u3 item sets), but it details an initial implementation.
If you know your domain set typically has users which do not have items in common, a small number of items per user, or an item domain which has a large number of distinct values, then this algorithm will be more efficient (previously you were comparing every user to another, with a low probability of intersection between the two sets). I'm sure a mathematician could prove this for you, but that I am not!
This also has the same potential scaling problem as before: if you have an item that all 100,000 users have in common, you still need to generate the roughly 5 billion user pairs. This is why it is important to understand your data before blindly applying an algorithm to it.
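The inverted approach can be tried out in plain Python before committing to a map-reduce framework. This is a minimal in-memory sketch, not a distributed job; the variable names are assumptions, and like the description above it ignores duplicate items:

```python
from collections import defaultdict
from itertools import combinations

user_items = {
    "u1": ["a", "a", "c", "h"],
    "u2": ["b", "a", "f"],
    "u3": ["a", "a", "f"],
}

# First job: invert the relationship into item -> set of users.
item_to_users = defaultdict(set)
for user, items in user_items.items():
    for item in items:
        item_to_users[item].add(user)

# Second job: for each item, emit every user pair with a count of one, then sum.
pair_counts = defaultdict(int)
for users in item_to_users.values():
    # Sorting keeps the pair key consistent (always u1_u2, never u2_u1).
    for first, second in combinations(sorted(users), 2):
        pair_counts[f"{first}_{second}"] += 1

result = dict(pair_counts)
```

Here result matches the summed pairs listed above: u1_u2 and u1_u3 each get 1, and u2_u3 gets 2.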
You want a step that emits all of the things the user has, like:
{ 'a': "u1" }
{ 'a': "u1" }
{ 'c': "u1" }
{ 'h': "u1" }
{ 'b': "u2" }
{ 'a': "u2" }
{ 'f': "u2" }
{ 'a': "u3" }
{ 'a': "u3" }
{ 'f': "u3" }
Then reduce them by key like:
{ 'a': ["u1", "u1", "u2", "u3", "u3"] }
{ 'b': ["u2"] }
{ 'c': ["u1"] }
{ 'f': ["u2", "u3"] }
{ 'h': ["u1"] }
And in that reducer emit each pair of distinct users that appears in each value, like:
{ 'u1_u2': 'a' }
{ 'u2_u3': 'a' }
{ 'u1_u3': 'a' }
{ 'u2_u3': 'f' }
Note that you'll want to make sure that in a key like k1_k2 that k1 < k2 so that they match up in any further mapreduce steps.
Then, if you need them all grouped like your example, add another mapreduce phase to combine them by key, and they'll end up like:
{ 'u1_u2': ['a'] }
{ 'u1_u3': ['a'] }
{ 'u2_u3': ['a', 'f'] }
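To count replicated items as well, one option (an extension of the steps above, not something this answer states) is to have the mapper emit a per-user count for each item and have the reducer emit the minimum of the two counts for each user pair. The function names map_phase and reduce_phase and the in-memory shuffle are assumptions for illustration:

```python
from collections import Counter, defaultdict
from itertools import combinations

user_items = {
    "u1": ["a", "a", "c", "h"],
    "u2": ["b", "a", "f"],
    "u3": ["a", "a", "f"],
}

def map_phase(user, items):
    # Emit (item, (user, occurrences)) once per distinct item of this user.
    for item, count in Counter(items).items():
        yield item, (user, count)

def reduce_phase(item, user_counts):
    # For each user pair sharing the item, the overlap is the smaller count.
    for (first, c1), (second, c2) in combinations(sorted(user_counts), 2):
        yield f"{first}_{second}", min(c1, c2)

# Simulated shuffle: group mapper output by item.
grouped = defaultdict(list)
for user, items in user_items.items():
    for item, value in map_phase(user, items):
        grouped[item].append(value)

# Final phase: sum the per-item overlaps for each user pair.
common_items = defaultdict(int)
for item, user_counts in grouped.items():
    for pair, overlap in reduce_phase(item, user_counts):
        common_items[pair] += overlap

result = dict(common_items)
```

With the sample data this gives 1 for u1_u2 and 2 for both u1_u3 and u2_u3, which are the counts asked for in the question.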
Does this work for you?
from itertools import combinations

user_sets = [
    {'u1': ['a', 'a', 'c', 'h']},
    {'u2': ['b', 'a', 'f']},
    {'u3': ['a', 'a', 'f']},
]

def compare_sets(items1, items2):
    # Work on a copy so the caller's list is not modified between comparisons.
    remaining = list(items2)
    common = 0
    for item in items1:
        if item in remaining:
            common += 1
            remaining.remove(item)  # each replicate can only be matched once
    return common

for comp1, comp2 in combinations(user_sets, 2):
    # Each dict holds exactly one user -> items entry.
    user1, items1 = next(iter(comp1.items()))
    user2, items2 = next(iter(comp2.items()))
    print('Common items between %s and %s: %s' % (
        user1, user2, compare_sets(items1, items2)))
Here's the output:
Common items between u1 and u2: 1
Common items between u1 and u3: 2
Common items between u2 and u3: 2

Checking to see if a list of lists has equal sized lists

I need to validate whether my list of lists has equally sized lists in Python:
myList1 = [ [1,1] , [1,1]] # This should pass. It has two lists, both of length 2
myList2 = [ [1,1,1] , [1,1,1], [1,1,1]] # This should pass. It has three lists, all of length 3
myList3 = [ [1,1] , [1,1], [1,1]] # This should pass. It has three lists, all of length 2
myList4 = [ [1,1,] , [1,1,1], [1,1,1]] # This should FAIL. It has three lists, one of which is different from the others
I could write a loop to iterate over the list and check the size of each sub-list. Is there a more pythonic way to achieve the result?
all(len(i) == len(myList[0]) for i in myList)
To avoid incurring the overhead of len(myList[0]) for each item, you can store it in a variable
len_first = len(myList[0]) if myList else None
all(len(i) == len_first for i in myList)
If you also want to be able to see why they aren't all equal
from itertools import groupby
groupby(sorted(myList, key=len), key=len)
This groups the lists by length, so you can easily see the odd one out.
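Because groupby returns a lazy iterator, the groups usually need to be materialized before they can be inspected. A small sketch, with sample data and the name by_length assumed for illustration:

```python
from itertools import groupby

my_list = [[1, 1], [1, 1, 1], [1, 1, 1]]

# Sort by length first: groupby only merges adjacent items with the same key.
by_length = {
    length: list(group)
    for length, group in groupby(sorted(my_list, key=len), key=len)
}

# by_length maps each length to the sub-lists that have it.
odd_lengths = [length for length, group in by_length.items() if len(group) == 1]
```

Here by_length has one entry for length 2 and one for length 3, and odd_lengths contains the lengths that occur only once.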
You could try:
test = lambda x: len(set(map(len, x))) == 1
test(myList1) # True
test(myList4) # False
Basically, you get the length of each list and make a set from those lengths; if the set contains a single element, then every list has the same length.
def equalSizes(*args):
    """
    # This should pass. It has two lists.. both of length 2
    >>> equalSizes([1,1] , [1,1])
    True
    # This should pass, It has three lists.. all of length 3
    >>> equalSizes([1,1,1] , [1,1,1], [1,1,1])
    True
    # This should pass, It has three lists.. all of length 2
    >>> equalSizes([1,1] , [1,1], [1,1])
    True
    # This should FAIL. It has three list.. one of which is different that the other
    >>> equalSizes([1,1,] , [1,1,1], [1,1,1])
    False
    """
    len0 = len(args[0])
    return all(len(x) == len0 for x in args[1:])
To test it save it to a file so.py and run it like this:
$ python -m doctest so.py -v
Trying:
equalSizes([1,1] , [1,1])
Expecting:
True
ok
Trying:
equalSizes([1,1,1] , [1,1,1], [1,1,1])
Expecting:
True
ok
Trying:
equalSizes([1,1] , [1,1], [1,1])
Expecting:
True
ok
Trying:
equalSizes([1,1,] , [1,1,1], [1,1,1])
Expecting:
False
ok
If you want a little more data in failure cases, you could do:
myList1 = [ [1,1] , [1,1]]
lens = set(map(len, myList1))  # itertools.imap existed only in Python 2
all_equal = len(lens) == 1
# if you have lists of varying length, at least you can get stats about what the different lengths are
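If the stats should also say how many sub-lists have each length, collections.Counter is one way to get them. This is a sketch with assumed sample data and names, not part of the answer above:

```python
from collections import Counter

my_list = [[1, 1], [1, 1, 1], [1, 1, 1]]

# Count how many sub-lists there are of each length.
length_counts = Counter(len(sub) for sub in my_list)

# All sub-lists are the same size when exactly one distinct length occurs.
all_equal = len(length_counts) == 1

# most_common() lists lengths from most to least frequent.
majority_length, majority_count = length_counts.most_common(1)[0]
```

In this example the majority length is 3 (two sub-lists), which makes the single sub-list of length 2 the odd one out.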