I have two lists, listA and listB, of type object.
ListA[name=abc, age=34, weight=0, height=0] data collected from excel sheet
ListB[name=null, age=0, weight=70, height=6] data collected from database
Now I want to combine both lists into a single list:
MergedList[name=abc, age=34, weight=70, height=6]
Note: my object class has more than 15 properties, so copying each property one by one using getProperty() will be time-consuming. Is there a better way?
Convert them to a Map where the key is the name of the object (your notation name=abc suggests the elements are name/value pairs).
Map<String,MyMysteriousObject> converted = list.stream().collect( Collectors.toMap(MyMysteriousObject::getName, Function.identity() ) );
(Replace getName with whatever method you use to get the name of your object.)
And then just merge the maps. How to merge maps is described here for example.
While at it, consider replacing the List with Map in your entire code. Will surely save a lot of work elsewhere too.
But if you have to have a list again, just List<MyMysteriousObject> resultList = new ArrayList<>(resultMap.values());
I have a list that contains sublists. The sequence of the sublist is fixed, as are the number of elements.
schedule = [['date1', 'action1', beginvalue1, endvalue1],
['date2', 'action2', beginvalue2, endvalue2],
...
]
Say I have a date and want to find what I have to do on that date, meaning I need to retrieve the entire sublist given only the date.
I did the following (which works): I created an intermediate list with the first value of each sublist. Using the index of the date in that list, I was able to retrieve the entire sublist, as follows:
dt = 'date150' # To just have a value to make underlying code more clear
ls_intermediate = [item[0] for item in schedule]
index = ls_intermediate.index(dt)
print(schedule[index])
It works but it just does not seem the Python way to do this. How can I improve this piece of code?
To be complete: there are no double 'date' entries in the list. Every date is unique and appears only once.
Learning Python, and having quite a journey in front of me...
thank you!
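One possible improvement, sketched under the assumption stated above that every date is unique: use next() for a one-off lookup, or build a dictionary keyed by date once when many lookups are needed. The sample values and the names find_entry and by_date are made up for illustration.

```python
# Hypothetical sample data in the same shape as the schedule above.
schedule = [
    ['date1', 'action1', 10, 20],
    ['date2', 'action2', 20, 35],
    ['date150', 'action150', 35, 50],
]


def find_entry(schedule, dt):
    """Return the first sublist whose date equals dt, or None if absent."""
    return next((item for item in schedule if item[0] == dt), None)


# For repeated lookups, index the schedule by date once.
# This relies on every date appearing only once.
by_date = {item[0]: item for item in schedule}

print(find_entry(schedule, 'date150'))
print(by_date.get('date150'))
```

The dictionary costs one pass to build but makes each later lookup a direct key access, while next() stops scanning at the first match and needs no intermediate list.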
I have two separate folders containing 3D arrays (data); each folder contains files of the same classification. I used mxnet.gluon.data.ArrayDataset() to create a dataset for each label. Is there a way to combine these two datasets into a final training dataset that covers both classifications? The two datasets are different sizes.
e.g
A_data = mx.gluon.data.ArrayDataset(list2,label_A )
noA_data = mx.gluon.data.ArrayDataset(list,label_noA)
^ I want to combine A_data and noA_data for a complete dataset.
Additionally, is there an easier way to combine the two folders with its classification into a mxnet dataset from the get-go? That would also solve my problem.
You could create a single ArrayDataset that contains both. If list and list2 are both Python lists, you could do something like:
full_data = mx.gluon.data.dataset.ArrayDataset(list + list2, label_noA + label_A)
where len(label_noA) == len(list) and len(label_A) == len(list2)
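A minimal sketch of that length requirement, using plain Python lists and placeholder strings instead of real 3D arrays. The names samples_A and samples_noA and the 1/0 label values are assumptions for illustration; the MXNet call itself is left as a comment so the snippet runs without MXNet installed.

```python
# Placeholders standing in for the arrays loaded from each folder.
samples_A = ['a0', 'a1', 'a2']   # plays the role of list2
samples_noA = ['n0', 'n1']       # plays the role of list

# One label per sample, so each label list matches its sample list in length.
label_A = [1] * len(samples_A)
label_noA = [0] * len(samples_noA)

# Concatenate samples and labels in the same order so they stay aligned.
all_samples = samples_noA + samples_A
all_labels = label_noA + label_A

print(len(all_samples), len(all_labels))
print(list(zip(all_samples, all_labels)))

# With MXNet available, the combined lists would then be wrapped as:
# full_data = mx.gluon.data.ArrayDataset(all_samples, all_labels)
```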
I have two lists, one of which is a list of lists. Both have the same number of indexes (so the second holds half as many values as the first), like this:
list1=[['47', '43'], ['299', '295'], ['47', '43'], etc.]
list2=[9.649, 9.612, 9.42, etc.]
I want to detect repeated pairs of values in the first list (and delete the duplicates), and sum the values at the matching indexes in the second list, creating an output like this:
list1=[['47', '43'], ['299', '295'], etc.]
list2=[19.069, 9.612, etc.]
The main problem is that the order of the values is important and I'm really stuck.
You could create a collections.defaultdict to sum values together, with keys as the sublists (converted as tuple to be hashable)
list1=[['47', '43'], ['299', '295'], ['47', '43']]
list2=[9.649, 9.612, 9.42]
import collections
c = collections.defaultdict(float)
for l,v in zip(list1,list2):
    c[tuple(l)] += v
print(c)
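Alternative using collections.Counter, which sums the same way when filled in a loop (passing the (key, value) pairs to the constructor would only count occurrences, not sum the values):
c = collections.Counter()
for k,v in zip(list1,list2):
    c[tuple(k)] += v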
At this point, we have the related data:
defaultdict(<class 'float'>, {('299', '295'): 9.612, ('47', '43'): 19.069})
Now, if needed (not sure, since the dictionary holds the data very well), we can rebuild the lists, keeping them linked to each other. On Python 3.7+ a dict iterates in insertion order, so the keys come back in order of first occurrence; the output below apparently comes from an older version, where the order was arbitrary:
list1=[]
list2=[]
for k,v in c.items():
    list1.append(list(k))
    list2.append(v)
print(list1,list2)
result:
[['299', '295'], ['47', '43']]
[9.612, 19.069]
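Since the question says the order matters, here is a sketch of an order-preserving variant. It assumes Python 3.7 or later, where a plain dict remembers insertion order; the function name merge_duplicates is made up for illustration.

```python
def merge_duplicates(keys, values):
    """Sum values that share a key, keeping keys in first-seen order."""
    totals = {}  # plain dicts keep insertion order on Python 3.7+
    for key, value in zip(keys, values):
        k = tuple(key)  # lists are not hashable, tuples are
        totals[k] = totals.get(k, 0.0) + value
    # Convert the tuple keys back to lists for the first output list.
    return [list(k) for k in totals], list(totals.values())


list1 = [['47', '43'], ['299', '295'], ['47', '43']]
list2 = [9.649, 9.612, 9.42]

new_list1, new_list2 = merge_duplicates(list1, list2)
print(new_list1)
print(new_list2)
```

Because these are floating-point sums, the printed totals may carry small rounding differences; round() can be applied when displaying them.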
My data structure was originally a big Map. But I read that we should not use maps that are too big, so as not to run out of atoms. So my new data structure looks like this:
countries = [[{'name', 'Germany'}, {'code', 'DE'}], [{'name', 'Austria'}, {'code', 'AT'}]]
I want to make a filter_by/3 method, to filter this nested list for the country list by attributes name or code
Should I transform the Tuples to Maps or is there another way to filter this?
You could use a list of maps. Maps are very performant when retrieving elements, especially when the keys in a map are very few.
In your example:
countries = [%{name: "Germany", code: "DE"},
%{name: "Austria", code: "AT"}]
Note that even if you use thousands of such maps in a list, you'll never run out of atoms, since :name and :code will always be the only two allocated atoms (each atom is exactly its value, so writing :a and :a is like writing 3 and 3).
Once you have a similar list in place, you can filter it with a function like:
def filter_by(countries, key, value) do
  Enum.filter(countries, fn(country) -> country[key] == value end)
end
filter_by(countries, :code, "AT")
I have a list of lists in the format below. This is data coming from a CSV, and I am trying to emulate Excel's data review function in Python. The only reason I can't do it directly in Excel is that the document is almost 1 GB and has 1.1 million rows.
((a1,b1,c1,d1,e1),(a1,b2,c1,d2,e2),(a1,b1,c2,d3,e3),(a2,b1,c1,d3,e4),(a2,b2,c2,d3,e5)...)
I want to convert it into a single data structure, something like a multidimensional array, as below:
((a1:(b1:(c1:(),c2:()),b2:(),b3:()),a2:(b1:(c1:()),b2:(c2:()),b3:())))
I use autovivify class for other purposes but I can't use it here because some of the keys I want to use are strings. Appreciate help here.
If I understand your question correctly, you want to transform that list into a tree-like structure, where each tuple in the list represents one path down the tree. You can do this using nested dictionaries:
def add_to_dict(d, t):
    if t:
        first, rest = t[0], t[1:]
        nested = d.setdefault(first, {})
        add_to_dict(nested, rest)
Given a dictionary d (initially empty) and one of those tuples t, if that tuple is not empty, it takes the first element from the tuple, adds a nested dictionary to the original dictionary using this element as key (or takes one that already exists in this place), and adds the rest of the tuple to that dictionary in the same way.
Example using your data:
data = (('a1','b1','c1','d1','e1'),
        ('a1','b2','c1','d2','e2'),
        ('a1','b1','c2','d3','e3'),
        ('a2','b1','c1','d3','e4'),
        ('a2','b2','c2','d3','e5'))
d = {}
for t in data:
    add_to_dict(d, t)
The resulting dictionary d looks like this:
{'a1': {'b1': {'c1': {'d1': {'e1': {}}},
               'c2': {'d3': {'e3': {}}}},
        'b2': {'c1': {'d2': {'e2': {}}}}},
 'a2': {'b1': {'c1': {'d3': {'e4': {}}}},
        'b2': {'c2': {'d3': {'e5': {}}}}}}
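With around 1.1 million rows, a loop-based variant of the same idea may be worth considering, since it avoids one function call and one tuple slice per level. This is only a sketch; the names add_path and build_tree and the shortened three-column rows are made up for illustration.

```python
def add_path(tree, path):
    """Walk down the tree along path, creating missing levels on the way."""
    node = tree
    for key in path:
        # setdefault returns the existing child dict or inserts a new one.
        node = node.setdefault(key, {})


def build_tree(rows):
    """Build one nested dictionary from an iterable of row tuples."""
    tree = {}
    for row in rows:
        add_path(tree, row)
    return tree


rows = [('a1', 'b1', 'c1'),
        ('a1', 'b2', 'c1'),
        ('a1', 'b1', 'c2')]

tree = build_tree(rows)
print(tree)
print(sorted(tree['a1']['b1']))
```

Because build_tree accepts any iterable, the rows could also be streamed from csv.reader instead of being loaded into memory first.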