Loading numerical data from a text file in python - python-2.7

I have a text file which is
1.25e5 15
2.7e6 12
18.e5 14
I want to read the text as a 2D array and assign the first column to x and the second to y.
Can anyone help me with how to do that? I did
f = open('energy.txt', 'r')
x = f.readlines()
but I don't know how to create the first column.

Since you're okay with numpy, you can just use np.loadtxt:
In [270]: np.loadtxt('energy.txt')
Out[270]:
array([[ 1.25000000e+05, 1.50000000e+01],
       [ 2.70000000e+06, 1.20000000e+01],
       [ 1.80000000e+06, 1.40000000e+01]])
Alternatively, a pure-Python way to do this is:
In [277]: data = []
In [278]: with open('energy.txt') as f:
     ...:     for line in f:
     ...:         i, j = line.split()
     ...:         data.append([float(i), int(j)])
     ...:
In [279]: data
Out[279]: [[125000.0, 15], [2700000.0, 12], [1800000.0, 14]]
With this approach, you store data as a list of lists, not a numpy array of floats. Also, you'll need to add a try-except in case your file has any malformed lines.
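A minimal sketch of the try-except idea, which also splits the result into x and y as the question asks. The helper name read_xy and the decision to skip malformed lines are assumptions for illustration, not part of the original answer; to keep the example self-contained, the sample rows are first written to a temporary file.

```python
import os
import tempfile


def read_xy(path):
    """Return (x, y) lists from a two-column whitespace-separated file.

    Hypothetical helper: lines that do not hold exactly two numbers are skipped.
    """
    x, y = [], []
    with open(path) as f:
        for line in f:
            try:
                first, second = line.split()
                # Convert both fields before appending so that a bad second
                # field cannot leave x and y with different lengths.
                xi, yi = float(first), int(second)
            except ValueError:
                continue  # wrong number of fields or non-numeric text
            x.append(xi)
            y.append(yi)
    return x, y


# Sample data from the question plus one malformed line.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1.25e5 15\n2.7e6 12\nnot a number\n18.e5 14\n")

x, y = read_xy(tmp.name)
os.remove(tmp.name)
print(x)
print(y)
```

If numpy is acceptable, np.loadtxt('energy.txt', unpack=True) returns the columns as separate arrays, so they can be assigned to x and y directly.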

Related

Exporting Dictionary to CSV

A fellow Stack Overflow user was kind enough to give me the code below for creating a dictionary of data frames. This works well, but now I want to export the data frames in the dictionary into a single CSV file. Can someone please help me with this?
import pandas as pd
DF1 = pd.DataFrame({"A": [3], "B": [2], "C": [100]})
DF_list = {}
for i in ["A", "B"]:
    DF = pd.DataFrame({})
    DF[i] = DF1[[i]]
    DF["C"] = DF1[["C"]]
    DF["value"] = DF[i] * DF["C"]
    DF_list["DF_" + i] = DF
print(DF_list)
{'DF_A': A C value
0 3 100 300, 'DF_B': B C value
0 2 100 200}

Can't merge two lists into a dictionary

I can't merge two lists into a dictionary. I tried the solutions from:
Map two lists into a dictionary in Python
I tried all of them and I still get an empty dictionary.
from sklearn.feature_extraction import DictVectorizer
from itertools import izip
import itertools
text_file = open("/home/vesko_/evnt_classification/bag_of_words", "r")
text_fiel2 = open("/home/vesko_/evnt_classification/sdas", "r")
lines = text_file.read().split('\n')
words = text_fiel2.read().split('\n')
diction = dict(itertools.izip(words,lines))
new_dict = {k: v for k, v in zip(words, lines)}
print new_dict
I get the following:
{'word': ''}
['word=']
The two lists are not empty.
I'm using Python 2.7.
EDIT :
Output from the two lists (I'm only showing a few because it's a vector with 11k features)
//lines
['change', 'I/O', 'fcnet2', 'ifconfig',....
//words
['word', 'word', 'word', .....
EDIT :
Now at least I have some output @DamianLattenero
{'word\n': 'XXAMSDB35:XXAMSDB35_NGCEAC_DAT_L_Drivei\n'}
['word\n=XXAMSDB35:XXAMSDB35_NGCEAC_DAT_L_Drivei\n']
I think the root of a lot of confusion is code in the example that is not relevant.
Try this:
text_file = open("/home/vesko_/evnt_classification/bag_of_words", "r")
text_fiel2 = open("/home/vesko_/evnt_classification/sdas", "r")
lines = text_file.read().split('\n')
words = text_fiel2.read().split('\n')
# remove any extra newline or whitespace from what was read in
# (the result must be assigned back, otherwise the stripped values are discarded)
lines = [line.rstrip() for line in lines]
words = [word.rstrip() for word in words]
new_dict = dict(zip(words,lines))
print new_dict
Python's built-in zip() pairs up the items of its arguments as tuples (in Python 2 it returns a list of them). Passing these tuples to the dict() constructor creates a dictionary in which each item in words is a key and the item at the same position in lines is its value.
Also note that zip() stops at the end of the shorter list, so if words and lines differ in length, the extra items in the longer one are silently dropped. Dictionary keys are also unique: if the same word appears several times, only the last value paired with it is kept.
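As a quick check of how zip() and dict() treat lists of unequal length and repeated keys, here is a small sketch. The sample lists are made up for illustration: zip() stops at the shorter list, and a repeated key keeps only its last value.

```python
# Made-up sample data: "word" is repeated and lines has one extra item.
words = ["word", "word", "word", "other"]
lines = ["change", "I/O", "fcnet2", "ifconfig", "extra"]

pairs = list(zip(words, lines))   # stops after the 4th pair; "extra" is dropped
mapping = dict(pairs)             # repeated key "word" keeps only its last value

print(pairs)
print(mapping)
```

A words list that contains the same string in every position therefore collapses to a dictionary with a single entry.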
I tried this and it worked for me. I created two files, one with the numbers 1 to 4 and one with the letters a to d, and the code creates the dictionary fine. I didn't need to import itertools, and there is one extra line that is not needed:
lines = [1,2,3,4]
words = ["a","b","c","d"]
diction = dict(zip(words,lines))
# new_dict = {k: v for k, v in zip(words, lines)}
print(diction)
{'a': 1, 'b': 2, 'c': 3, 'd': 4}
If that works but your original code does not, the problem must be in how the lists are loaded. Try loading them like this:
def create_list_from_file(file):
    with open(file, "r") as ins:
        my_list = []
        for line in ins:
            my_list.append(line)
    return my_list
lines = create_list_from_file("/home/vesko_/evnt_classification/bag_of_words")
words = create_list_from_file("/home/vesko_/evnt_classification/sdas")
diction = dict(zip(words,lines))
# new_dict = {k: v for k, v in zip(words, lines)}
print(diction)
Observation:
If your files look like this:
1
2
3
4
and
a
b
c
d
the result will have four keys in the dictionary, one per line:
{'a\n': '1\n', 'b\n': '2\n', 'c\n': '3\n', 'd': '4'}
But if your files look like:
1 2 3 4
and
a b c d
the result will be {'a b c d': '1 2 3 4'}, with only one key and one value.
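The trailing '\n' in the keys and values above comes from appending each line unchanged. A sketch of a variant that strips line endings and skips empty lines follows. The helper names read_stripped_lines and write_temp are hypothetical, and temporary files stand in for the real input files.

```python
import os
import tempfile


def read_stripped_lines(path):
    """Hypothetical helper: one list item per non-empty line, newline removed."""
    with open(path) as handle:
        return [line.rstrip("\r\n") for line in handle if line.strip()]


def write_temp(text):
    """Write text to a temporary file and return its path (demo only)."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write(text)
    return tmp.name


lines_path = write_temp("1\n2\n3\n4")
words_path = write_temp("a\nb\nc\nd")

diction = dict(zip(read_stripped_lines(words_path),
                   read_stripped_lines(lines_path)))
os.remove(lines_path)
os.remove(words_path)
print(diction)
```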

Make a comma separated list out of co-ordinates from a csv file

I have x and y values in a CSV file. I read them and convert them into a numpy array using the code below:
import numpy as np
import csv
data = np.loadtxt('datapoints.csv', delimiter=',')
# Putting data from csv file to variable
x = data[:, 0]
y = data[:, 1]
# Converting npArray to simple array
np.asarray(x)
np.asarray(y)
So now I have the values of x and y.
But I want them in this format:
[[x1,y1],[x2,y2], [x3,y3], ...... [xn,yn]]
How do I do that?
Use zip:
result = [list(a) for a in zip(np.asarray(x),np.asarray(y))]
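If data is already a two-column array, as loaded in the question, its tolist() method should give the same nested list without zip. A small sketch, with made-up coordinates standing in for the contents of datapoints.csv:

```python
import numpy as np

# Made-up values standing in for the contents of datapoints.csv.
data = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

x = data[:, 0]
y = data[:, 1]

via_zip = [list(pair) for pair in zip(x, y)]   # approach from the answer
via_tolist = data.tolist()                     # nested list of Python floats

print(via_tolist)
```

The zip version keeps numpy scalar types inside the lists, while tolist() converts them to plain Python floats; which one is preferable depends on what consumes the result.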

Convert a text file to a numpy array

I am new to python. I have a .txt file
SET = 20,21,23,21,23
45,46,42,23,55
with many rows. How would I convert this text file into an array, ignoring spaces and commas? Any help would be really appreciated.
l1=[]
file = open('list-num')
for l in file:
    l2 = map(int,l.split(','))
    l1 = l1 + l2
print l1
Your data looks like:
SET 1 = 13900100,13900141,13900306,13900442,13900453,13900461, 13900524,13900537,13900619,13900632,13900638,13900661, 13900665,13900758,13900766,13900825,13900964,13901123, 13901131,13901136,13901141,13901143,13901195,13901218,
You can use the numpy function np.genfromtxt():
import numpy as np
data = np.genfromtxt("text.txt", delimiter=",")
data = data[np.logical_not(np.isnan(data))]  # Remove nan values
print data
I get :
[ 13900141. 13900306. 13900442. 13900453. 13900461. 13900524.
13900537. 13900619. 13900632. 13900638. 13900661. 13900665.
13900758. 13900766. 13900825. 13900964. 13901123. 13901131.
13901136. 13901141. 13901143. 13901195. 13901218.]
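It should work ;) Note that the first value, 13900100, is missing from the output: the field "SET 1 = 13900100" is not a valid number, so genfromtxt reads it as nan and the filter removes it.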
------------------------------------
Another way:
import numpy as np
f = open("text.txt", "r")  # Open data file
data = f.read()            # Read data file
f.close()
cut = data.split()         # Split on whitespace
value = cut[2]             # Pick the value part (third whitespace-separated token)
array = np.array(value)    # Wraps the string in an array; the values are not yet numbers
print array
I get :
13900100,13900141,13900306,13900442,13900453,13900461,13900524,13900537,13900619,13900632,13900638,13900661,13900665,13900758,13900766,13900825,13900964,13901123,13901131,13901136,13901141,13901143,13901195,13901218
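Note that np.array(value) above wraps a single string rather than producing numbers. Below is a sketch of turning such a file into integers with plain Python, assuming the layout shown in the question (an optional "SET ... =" prefix followed by comma-separated integers spread over several lines). The name parse_numbers is hypothetical.

```python
def parse_numbers(text):
    """Hypothetical helper: pull comma-separated integers out of text,
    ignoring whitespace and an optional 'NAME =' prefix on any line."""
    numbers = []
    for line in text.splitlines():
        # Keep only what follows '=' when the line has a prefix.
        payload = line.split("=", 1)[-1]
        for field in payload.split(","):
            field = field.strip()
            if field:  # skip empty fields, e.g. after a trailing comma
                numbers.append(int(field))
    return numbers


# Sample text copied from the question.
sample = "SET = 20,21,23,21,23\n45,46,42,23,55\n"
values = parse_numbers(sample)
print(values)
```

Passing the resulting list to np.array(values) then gives a one-dimensional integer array.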

Python: Write two lists into two column text file

Say I have two lists:
a=[1,2,3]
b=[4,5,6]
I want to write them into a text file such that I obtain a two column text file:
1 4
2 5
3 6
Simply zip the list, and write them to a csv file with tab as the delimiter:
>>> a=[1,2,3]
>>> b=[4,5,6]
>>> zip(a,b)
[(1, 4), (2, 5), (3, 6)]
>>> import csv
>>> with open('text.csv', 'w') as f:
...     writer = csv.writer(f, delimiter='\t')
...     writer.writerows(zip(a,b))
...
>>> quit()
$ cat text.csv
1 4
2 5
3 6
You can use numpy.savetxt(), which is a convenient tool from the numpy library.
A minimal example would be as follows:
import numpy as np
xarray = np.array([0, 1, 2, 3, 4, 5])
yarray = np.array([0, 10, 20, 30, 40, 50])
# here is your data, in two numpy arrays
data = np.column_stack([xarray, yarray])
datafile_path = "/your/data/output/directory/datafile.txt"
np.savetxt(datafile_path , data, fmt=['%d','%d'])
# here the ascii file is written.
The fmt field in np.savetxt() in the example specifies that the numbers are integers.
You can use a different format for each column.
E.g. to specify floating point format, with 2 decimal digits and 10 characters wide columns, you would use '%10.2f'.
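To illustrate per-column formats, here is a sketch that writes an integer column next to a '%10.2f' column. The sample values and the temporary output path are assumptions made only for this demonstration.

```python
import os
import tempfile

import numpy as np

# Made-up sample data: an integer-valued column and a float column.
xarray = np.array([0, 1, 2])
yarray = np.array([0.5, 10.25, 20.125])
data = np.column_stack([xarray, yarray])

# Temporary location used only for this demonstration.
handle, datafile_path = tempfile.mkstemp(suffix=".txt")
os.close(handle)

# Integer format for the first column, fixed-point for the second.
np.savetxt(datafile_path, data, fmt=["%d", "%10.2f"])

with open(datafile_path) as f:
    written = f.read().splitlines()
os.remove(datafile_path)
print("\n".join(written))
```

Each format in the list applies to one column, and the formatted fields are joined with the delimiter, which is a single space by default.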
Try this:
file = open("list.txt", "w")
for index in range(len(a)):
    file.write(str(a[index]) + " " + str(b[index]) + "\n")
file.close()
A simple solution is to write columns of fixed-width text:
a=[1,2,3]
b=[4,5,6]
col_format = "{:<5}" * 2 + "\n"   # 2 left-justified columns with 5 character width
with open("foo.csv", 'w') as of:
    for x in zip(a, b):
        of.write(col_format.format(*x))
Then cat foo.csv produces:
1 4
2 5
3 6
The output is both human and machine readable, whereas tabs can generate messy looking output if the precision of the values varies along the column. It also avoids loading the separate csv and numpy libraries, but works with both lists and arrays.
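To illustrate the point about varying precision, here is a sketch that pads each value to a fixed width with format specifiers, so the columns stay aligned even when the numbers differ in length. The widths and sample values are arbitrary choices.

```python
# Made-up sample data with values of different lengths.
a = [1.5, 22.25, 333.125]
b = [4, 50, 600]

# Right-align a float column (10 wide, 2 decimals) and an int column (6 wide).
row_format = "{:>10.2f}{:>6d}\n"
rows = [row_format.format(x, y) for x, y in zip(a, b)]

text = "".join(rows)
print(text)
```

Every row has the same length, so the file can be read back either by splitting on whitespace or by slicing at fixed character positions.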
You can write two lists into a text file that contains two columns.
a=[1,2,3]
b=[4,5,6]
c = [a, b]
with open("list1.txt", "w") as file:
    for x in zip(*c):
        file.write("{0}\t{1}\n".format(*x))
Output in the text file:
1 4
2 5
3 6
There is a straightforward way to stack vectors of the same length as columns and save them. Use np.c_, which concatenates its arguments column-wise; you can stack 3, 4 or N vectors this way, with the columns delimited by a tab.
np.savetxt('experimental_data_%s_%.1fa_%dp.txt'%(signal,angle,p_range), np.c_[DCS_exp, p_exp], delimiter="\t")