df.info() doesn't show any information about the dataframe - python-2.7

When I run
print df
the result is
A B C D
0 4 8 4-a
7 3 5 3-b
When I select only one column
print df['D']
nothing is shown. The same happens with
print df.info()
which also shows nothing.
I can't understand what is wrong.
I set up the data using this code:
import pandas as pd
data = {'A': {0: 0, 1: 4, 2: 5, 3: 6, 4: 7, 5: 7, 6: 6},
        'B': {0: 's', 1: 's', 2: 's', 3: 's', 4: 's', 5: 's', 6: 's'},
        'C': {0: 3, 1: 2, 2: 2, 3: 1, 4: 2, 5: 3, 6: 0},
        'D': {0: 'a', 1: 'a', 2: 'a', 3: 'a', 4: 'b', 5: 'b', 6: 'b'}}
df = pd.DataFrame(data)
# Handling column A (first row per value in D)
output_df = df.drop_duplicates(subset='D', keep='first')
# Iterating through rows
for index, row in output_df.iterrows():
    # Calculating the counts in B
    output_df.loc[index, 'B'] = df[df.D == row.D].B.count()
    # Calculating the sum in C
    output_df.loc[index, 'C'] = df[df.D == row.D].C.sum()
# Finally changing values in D by concatenating values in B and D
output_df.loc[:, 'D'] = output_df.B.map(str) + "-" + output_df.D

Related

Create a dataframe based on an old one

I have a dataframe as :
A B C D
0 s 3 a
4 s 2 a
5 s 2 a
6 s 1 a
7 s 2 b
7 s 3 b
6 s 0 b
How can I create a new dataframe as the following?
A B C D
0 4 8 4-a
7 3 5 3-b
The new dataframe summarizes the old one by grouping the rows on column "D": "A" is the index (the first "A" value of each group), "B" is the count of elements, and "C" is the sum of the elements where "D" has the same value.

Well, assuming that your data is stored in df, it's a multistep process which could be done like this:
import pandas as pd
data = {'A': {0: 0, 1: 4, 2: 5, 3: 6, 4: 7, 5: 7, 6: 6},
        'B': {0: 's', 1: 's', 2: 's', 3: 's', 4: 's', 5: 's', 6: 's'},
        'C': {0: 3, 1: 2, 2: 2, 3: 1, 4: 2, 5: 3, 6: 0},
        'D': {0: 'a', 1: 'a', 2: 'a', 3: 'a', 4: 'b', 5: 'b', 6: 'b'}}
df = pd.DataFrame(data)
# Handling column A (first row per value in D);
# .copy() avoids a SettingWithCopyWarning on the assignments below
output_df = df.drop_duplicates(subset='D', keep='first').copy()
# Iterating through rows
for index, row in output_df.iterrows():
    # Calculating the counts in B
    output_df.loc[index, 'B'] = df[df.D == row.D].B.count()
    # Calculating the sum in C
    output_df.loc[index, 'C'] = df[df.D == row.D].C.sum()
# Finally changing values in D by concatenating values in B and D
output_df.loc[:, 'D'] = output_df.B.map(str) + "-" + output_df.D
Output :
A B C D
0 4 8 4-a
7 3 5 3-b
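
The same summary can probably be written without an explicit loop by using groupby. The following is only a sketch of that alternative, not the method used in the answer above: it assumes pandas 0.25 or later (for named aggregation), and the name summary is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [0, 4, 5, 6, 7, 7, 6],
    'B': ['s'] * 7,
    'C': [3, 2, 2, 1, 2, 3, 0],
    'D': ['a', 'a', 'a', 'a', 'b', 'b', 'b'],
})

# One row per value of D: first A, number of B entries, sum of C
summary = (
    df.groupby('D')
      .agg(A=('A', 'first'), B=('B', 'count'), C=('C', 'sum'))
      .reset_index()
)
# Build the "count-label" strings and restore the original column order
summary['D'] = summary['B'].astype(str) + '-' + summary['D']
summary = summary[['A', 'B', 'C', 'D']]
print(summary)
```

Unlike the loop version, the resulting index is 0 and 1 rather than the original row labels.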

Merging two lists of different length to form a list of dictionaries

Hi, please help me develop logic that does the following.
list_1 = [1,2,3]
list_2 = [a,b,c,d,e,f,g,h,i]
Required output (List of dictionaries):
output = [{1:a,2:b,3:c}, {1:d,2:e,3:f}, {1:g,2:h,3:i}]
My script:
return_list = []
k = 0
temp_dict = {}
for i, value in enumerate(list_2):
    if k <= len(list_1)-1:
        temp_dict[list_1[k]] = value
        if k == len(list_1)-1:
            k = 0
            print temp_dict
            return_list.append(temp_dict)
            print return_list
            print '\n'
        else:
            k = k + 1
print return_list
My output:
{1: 'a', 2: 'b', 3: 'c'}
[{1: 'a', 2: 'b', 3: 'c'}]
{1: 'd', 2: 'e', 3: 'f'}
[{1: 'd', 2: 'e', 3: 'f'}, {1: 'd', 2: 'e', 3: 'f'}]
{1: 'g', 2: 'h', 3: 'i'}
[{1: 'g', 2: 'h', 3: 'i'}, {1: 'g', 2: 'h', 3: 'i'}, {1: 'g', 2: 'h', 3: 'i'}]
[{1: 'g', 2: 'h', 3: 'i'}, {1: 'g', 2: 'h', 3: 'i'}, {1: 'g', 2: 'h', 3: 'i'}]
As you can see, temp_dict is printed correctly each time, but return_list ends up containing the last temp_dict 3 times.
Please help me solve this.

The issue here is that you are not resetting temp_dict to a new object.
When you append it to the list, the list holds a reference to that same dict object. When you change the dict on the next pass through the loop, the entry already in the list changes too, because it is the same object.
If you rebind temp_dict to a new dict after appending, it should work:
list_1 = [1,2,3]
list_2 = ['a','b','c','d','e','f','g','h','i']
return_list = []
k = 0
temp_dict = {}
for i, value in enumerate(list_2):
    if k <= len(list_1)-1:
        temp_dict[list_1[k]] = value
        if k == len(list_1)-1:
            k = 0
            print temp_dict
            return_list.append(temp_dict)
            temp_dict = {} # Here is the change
            print return_list
            print '\n'
        else:
            k = k + 1
print return_list
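
The reference behaviour described above can be seen in isolation with a smaller example. This is only an illustrative sketch; the names shared, aliased, fresh and independent are made up and do not appear in the question.

```python
# Appending a dict stores a reference to it, not a copy.
shared = {}
aliased = []
for value in ['a', 'b']:
    shared[1] = value
    aliased.append(shared)      # the same object is appended each time

# Creating a new dict on each pass keeps the entries independent.
independent = []
for value in ['a', 'b']:
    fresh = {1: value}
    independent.append(fresh)

print(aliased)                   # both entries show the last value written
print(independent)               # each entry keeps its own value
print(aliased[0] is aliased[1])  # True: one object, listed twice
```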

This should work:
list_1 = [1,2,3]
list_2 = ['a','b','c','d','e','f','g','h','i']
output = []
j = 0
for i in range(1, len(list_1) + 1):
    output.append(dict(zip(list_1, list_2[j:i * 3])))
    j = i * 3
print(output)
The assumption is that list_1 has 3 elements (the chunk size 3 is hard-coded) and that your second list is exactly 3 times as long as the first list.

def merge_them(list1, list2):
    output = []
    i = 0
    # Use the parameters, not the global list_1/list_2
    while i < len(list2):
        output.append(dict(zip(list1, list2[i: i + len(list1)])))
        i += len(list1)
    return output
and you can test it:
test1:
list_1 = [1,2,3]
list_2 = ['a','b','c','d','e','f','g','h','i']
print merge_them(list_1, list_2)
you will get:
[{1: 'a', 2: 'b', 3: 'c'}, {1: 'd', 2: 'e', 3: 'f'}, {1: 'g', 2: 'h', 3: 'i'}]
test2:
list_1 = [1,2,3,]
list_2 = ['a','b','c','d','e']
print merge_them(list_1, list_2)
you will get:
[{1: 'a', 2: 'b', 3: 'c'}, {1: 'd', 2: 'e'}]
test3:
list_1 = [1,2,3,4,5,6,7,8]
list_2 = ['a','b','c','d','e']
print merge_them(list_1, list_2)
you will get:
[{1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}]

Python merge two lists with different lengths

Hi, please help me develop logic that does the following.
list_1 = [1,2,3]
list_2 = [a,b,c,d,e,f,g,h,i]
Required output (List of dictionaries):
output = [{1:a,2:b,3:c}, {1:d,2:e,3:f}, {1:g,2:h,3:i}]
My script:
return_list = []
k = 0
temp_dict = {}
for i, value in enumerate(list_2):
    if k <= len(list_1)-1:
        temp_dict[list_1[k]] = value
        if k == len(list_1)-1:
            k = 0
            print temp_dict
            return_list.append(temp_dict)
            print return_list
            print '\n'
        else:
            k = k + 1
print return_list
My output:
{1: 'a', 2: 'b', 3: 'c'}
[{1: 'a', 2: 'b', 3: 'c'}]
{1: 'd', 2: 'e', 3: 'f'}
[{1: 'd', 2: 'e', 3: 'f'}, {1: 'd', 2: 'e', 3: 'f'}]
{1: 'g', 2: 'h', 3: 'i'}
[{1: 'g', 2: 'h', 3: 'i'}, {1: 'g', 2: 'h', 3: 'i'}, {1: 'g', 2: 'h', 3: 'i'}]
[{1: 'g', 2: 'h', 3: 'i'}, {1: 'g', 2: 'h', 3: 'i'}, {1: 'g', 2: 'h', 3: 'i'}]
As you can see, temp_dict is printed correctly each time, but return_list ends up containing the last temp_dict 3 times.
Please help me solve this.

return_list = []
k = 0
temp_dict = {}
for i, value in enumerate(list_2):
    if k <= len(list_1)-1:
        temp_dict[list_1[k]] = value
        if k == len(list_1)-1:
            k = 0
            print temp_dict
            return_list.append(temp_dict)
            temp_dict = {}
        else:
            k = k + 1

For simplicity, you could use zip:
list_1 = [1,2,3]
list_2 = ['a','b','c','d','e','f','g','h','i']
chunks = [list_2[idx:idx+3] for idx in range(0, len(list_2), 3)]
output = []
for each in chunks:
    output.append(dict(zip(list_1, each)))
print(output)

from itertools import cycle
list_1 = [1, 2, 3]
list_2 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i']
def chunks(l, n):
    for i in range(0, len(l), n):
        yield l[i:i + n]
zipped = tuple(zip(cycle(list_1), list_2))
out = list(map(dict, chunks(zipped, len(list_1))))
print(out)
Gives you:
[{1: 'a', 2: 'b', 3: 'c'}, {1: 'd', 2: 'e', 3: 'f'}, {1: 'g', 2: 'h', 3: 'i'}]

merge x python dictionaries into 1 with aggregated values

I have a list of dictionaries with the same key names. I want to consolidate them into one dictionary in which only the numeric values are averaged:
[{'a': 3, 'b': 'm', 'c': 7},
{'a': 1.0, 'b': 'm', 'c': 2},
{'a': 5, 'b': 'm', 'c': 4.0}]
into an averaged dictionary:
[{'a': 3, 'b': 'm', 'c': 4}]

If you can assume you have at least one dict in the list and all the dicts have all the keys, you can do:
import numbers
dicts = [{'a': 3, 'b': 'm', 'c': 7},
         {'a': 1.0, 'b': 'm', 'c': 2},
         {'a': 5, 'b': 'm', 'c': 4.0}]
avg_dict = {}
for key in dicts[0]:
    if isinstance(dicts[0][key], numbers.Number):
        # Numeric value: average it across all dicts
        avg_dict[key] = sum([d[key] for d in dicts])/len(dicts)
    else:
        # Non-numeric value: take it from the first dict
        avg_dict[key] = dicts[0][key]
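
The same idea can be wrapped in a function that also tolerates an empty list. This is a sketch under the same assumption that every dict has every key; the name average_dicts is hypothetical. It divides by a float so that all-integer columns are not truncated under Python 2, which means the averages are not rounded to the integers shown in the question.

```python
import numbers

def average_dicts(dicts):
    """Average numeric values key by key; copy other values from the first dict."""
    if not dicts:
        return {}
    result = {}
    for key, first in dicts[0].items():
        # bool is registered as a Number, so exclude it explicitly
        if isinstance(first, numbers.Number) and not isinstance(first, bool):
            result[key] = sum(d[key] for d in dicts) / float(len(dicts))
        else:
            result[key] = first
    return result

dicts = [{'a': 3, 'b': 'm', 'c': 7},
         {'a': 1.0, 'b': 'm', 'c': 2},
         {'a': 5, 'b': 'm', 'c': 4.0}]
averaged = average_dicts(dicts)
print(averaged)
```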

Maybe not the most pythonic way, but it will do the job:
lst = [{'a': 3, 'b': 'm', 'c': 7},
       {'a': 1.0, 'b': 'm', 'c': 2},
       {'a': 5, 'b': 'm', 'c': 4.0}]
result = {}
for item in lst:
    for j in item:
        if type(item[j]) == str:
            result[j] = item[j]
        elif j in result:
            result[j] += item[j]
        else:
            result[j] = item[j]
for i in result:
    if type(result[i]) != str:
        result[i] = int(result[i] / len(lst))
print(result)

How do I Pythonically print a list with formatting?

I have a list:
L = [1, 2, 3, 4, 5, 6]
and I want to print
1 B 2 J 3 C 4 A 5 J 6 X
from that list.
How do I do that?
Do I have to make another list and zip them up, or is there some way I can have the letters in my format specifier?

You could do it either way:
L = [1, 2, 3, 4, 5, 6]
from itertools import chain
# new method
print "{} B {} J {} C {} A {} J {} X".format(*L)
# old method
print "%s B %s J %s C %s A %s J %s X" % tuple(L)
# without string formatting
print ' '.join(chain.from_iterable(zip(map(str, L), 'BJCAJX')))
See the docs on str.format and string formatting.

A nice way to do this is to have a dictionary mapping numbers to prefixes:
prefixes = {1: 'B', 2: 'J', 3: 'C', 4: 'A', 5: 'J', 6: 'X'}
Then you can do:
print ' '.join('%s %s' % (num, prefix) for num, prefix in sorted(prefixes.items()))
If you also have a list of letters:
nums = [1, 2, 3, 4, 5, 6]
ltrs = ['B', 'J', 'C', 'A', 'J', 'X']
print ' '.join('%s %s' % (num, ltr) for num, ltr in zip(nums, ltrs))
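
If the letters should live in the format specifier itself, as the question asks, the format string can also be built from them rather than typed out by hand. A minimal sketch, assuming the letters are available as a string; the names letters, fmt and line are made up for illustration.

```python
L = [1, 2, 3, 4, 5, 6]
letters = 'BJCAJX'

# One "{} <letter>" placeholder per letter, joined with spaces
fmt = ' '.join('{} ' + ch for ch in letters)
line = fmt.format(*L)
print(line)
```

The call is written as print(line) so that it runs under both Python 2.7 and Python 3.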
