Save list of table of numbers from Python into format easily readable by Mathematica? - python-2.7

I am running a simulation in Python. The simulation's results are summarized in a list of number matrices. Is there a nice export format I can use to write this list, so that later I can read the file in Mathematica easily, and Mathematica will recognize it as a list of matrices automatically?

Well, it depends on how large your matrices are and whether speed or memory is a concern for you. The simplest solution is to create a plain-text Mathematica expression yourself. Just iterate through your matrices and write them out as a list in Mathematica format. This boils down to writing braces and numbers to a file:
{mat1, mat2, ...}
where mat1, etc are themselves lists of lists of numbers.
Update 1
If you want a standardized format, then you could look at what you can easily import into Mathematica. One thing that catches the eye (after MTX, which obviously doesn't work) is the MAT format. A quick search seems to indicate that you can write those files with Python.
Update 2
Regarding your comment
Pythonica looks nice. Regrettably, I am running the Python simulations on a cluster that does not have Mathematica installed. I am using Mathematica in my personal PC for post-processing.
OK, but the package is not even 500 lines of code. Why don't you skim over it and just take out what you need: Code that transforms arbitrary Python lists to Mathematica code
_id_to_mathematica = lambda x: str(x)
def _float_to_mathematica(x):
    return ("%e" % x).replace('e', '*10^')
def _complex_to_mathematica(z):
    return 'Complex' + ('[%e,%e]' % (z.real, z.imag)).replace('e', '*10^')
def _str_to_mathematica(s):
    return '\"%s\"' % s
def _iter_to_mathematica(xs):
    s = '{'
    for x in xs:
        s += _python_mathematica[type(x)](x)
        s += ','
    s = s[:-1]
    s += '}'
    return s
_python_mathematica = {bool: _id_to_mathematica,
                       type(None): _id_to_mathematica,
                       int: _id_to_mathematica,
                       float: _float_to_mathematica,
                       long: _id_to_mathematica,
                       complex: _complex_to_mathematica,
                       iter: _iter_to_mathematica,
                       list: _iter_to_mathematica,
                       set: _iter_to_mathematica,
                       xrange: _iter_to_mathematica,
                       str: _str_to_mathematica,
                       tuple: _iter_to_mathematica,
                       frozenset: _iter_to_mathematica}
l = [[1, 2, 3], 1, [1, 5, [7, 3, 7, 8]]]
print(_iter_to_mathematica(l))
The output is a string
{{1,2,3},1,{1,5,{7,3,7,8}}}
that you can directly save into a file and load it into Mathematica using Get.
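If only nested lists of real numbers need to be exported, the same idea can be sketched more compactly. This is a minimal sketch, not the Pythonica code: the names to_mathematica and save_for_mathematica are made up for illustration, it assumes the data contain only lists, tuples, booleans, integers and floats, and it reuses the *10^ exponent trick from above (with 17 significant digits, so that floats should round-trip).

```python
import numbers


def to_mathematica(obj):
    """Render nested lists/tuples of real numbers as a Mathematica list string."""
    if isinstance(obj, (list, tuple)):
        return "{" + ",".join(to_mathematica(x) for x in obj) + "}"
    if isinstance(obj, bool):
        # Checked before Integral because bool is a subclass of int.
        return "True" if obj else "False"
    if isinstance(obj, numbers.Integral):
        return str(obj)
    if isinstance(obj, numbers.Real):
        # "%.16e" keeps 17 significant digits; the "e" exponent marker is
        # rewritten as "*10^", as in the code above.
        return ("%.16e" % obj).replace("e", "*10^")
    raise TypeError("unsupported type: %r" % type(obj))


def save_for_mathematica(matrices, path):
    """Write the whole list of matrices to a text file as one expression."""
    with open(path, "w") as fh:
        fh.write(to_mathematica(matrices))


# Hypothetical simulation output: two integer matrices of different shapes.
matrices = [[[1, 2], [3, 4]], [[5, 6, 7]]]
text = to_mathematica(matrices)
print(text)  # {{{1,2},{3,4}},{{5,6,7}}}
```

A file written with save_for_mathematica can then be loaded with Get, as described above. Joining the pieces with "," also handles empty lists, which come out as {}.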

How big are the matrices?
If they are not too large, the JSON format will work well. I have used it; it is easy to work with in both Python and Mathematica.
If they are large, I would try HDF5. I have no experience with writing this from Python, but I know that it can store multiple datasets, thus it can store multiple matrices of different sizes.
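For the JSON route, the Python side could look roughly like this. It is a sketch under assumptions: the matrices are plain nested lists (NumPy arrays would first need .tolist()), and dump_matrices and the sample data are hypothetical. On the Mathematica side, Import on the resulting file should give back nested lists, but that is worth checking on a small file first.

```python
import json

# Hypothetical simulation output: matrices of different sizes as nested lists.
matrices = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, 1.5, 2.5]],
]


def dump_matrices(matrices, path):
    """Write the list of matrices to a JSON file."""
    with open(path, "w") as fh:
        json.dump(matrices, fh)


# The same serialization as a string, to show what ends up in the file.
text = json.dumps(matrices)
print(text)
```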

'DataFlowAnalysis' object has no attribute 'op_MAKE_FUNCTION' in Numba

I haven't seen this specific scenario in my research for this error in Numba. This is my first time using the package so it might be something obvious.
I have a function that calculates engineered features in a data set by adding, multiplying and/or dividing each pair of columns in a dataframe called data, and I wanted to test whether numba would speed it up:
@jit
def engineer_features(engineer_type,features,joined):
    #choose which features to engineer (must be > 1)
    engineered = features
    if len(engineered) > 1:
        if 'Square' in engineer_type:
            sq = data[features].apply(np.square)
            sq.columns = map(lambda s:s + '_^2',features)
        for c1,c2 in combinations(engineered,2):
            if 'Add' in engineer_type:
                data['{0}+{1}'.format(c1,c2)] = data[c1] + data[c2]
            if 'Multiply' in engineer_type:
                data['{0}*{1}'.format(c1,c2)] = data[c1] * data[c2]
            if 'Divide' in engineer_type:
                data['{0}/{1}'.format(c1,c2)] = data[c1] / data[c2]
        if 'Square' in engineer_type and len(sq) > 0:
            data= pd.merge(data,sq,left_index=True,right_index=True)
    return data
When I call it with lists of features, engineer_type and the dataset:
engineer_type = ['Square','Add','Multiply','Divide']
df = engineer_features(engineer_type,features,joined)
I get the error: Failed at object (analyzing bytecode)
'DataFlowAnalysis' object has no attribute 'op_MAKE_FUNCTION'
Same question here. I think the problem might be the lambda function, since numba does not support function creation.
I had this same error. Numba doesn't support pandas. I converted the important columns from my pandas df into a bunch of arrays, and it worked successfully under @jit.
Also, arrays are much faster than a pandas df, in case you need that for processing large data.

Understanding; for i in range, x,y = [int(i) in i.... Python3

I am stuck trying to understand the mechanics behind this combination of input(), a loop and a list comprehension, from Codegaming's "MarsRover" puzzle. The sequence creates a 2D line, representing a cut-out of the topology in an area 6999 units wide (x-axis).
Understandably, my original question was put on hold for being too broad. I am trying to shorten and narrow the question: I understand list comprehensions at a basic level, and I'm reasonably experienced with for-loops.
Like list comp:
land_y = [int(j) for j in range(k)]
# if k = 5, then land_y == [0, 1, 2, 3, 4]
For-loops:
ab = []
for i in range(4):
    a = 2*i
    ab.append(a)
# afterwards ab == [0, 2, 4, 6]
But here, it just doesn't add up (in my head):
6999 points are created along the x-axis, from 6 points(x,y).
surface_n = int(input())
for i in range(surface_n):
    land_x, land_y = [int(j) for j in input().split()]
I do not understand where "i" makes a difference.
I do not understand how the data is "packaged" inside the input. I have split strings of integers on another task with almost exactly the same code, and I could easily create new lists and work with them, as I understood the structure I was unpacking (pretty simple, being one datatype with one purpose).
The fact that this line follows within the "game"-while-loop confuses me more, as it updates dynamically as the state of the game changes.
x, y, h_speed, v_speed, fuel, rotate, power = [int(i) for i in input().split()]
Maybe someone could give an example of how this could be written in javascript, haskell or c#? No need to be syntax-correct, I'm just struggling with the concept here.
input() takes a line from the standard input. So it’s essentially reading some value into your program.
The way that code works, it makes very strong assumptions about the format of the input strings, to the point that it gets confusing (and difficult to verify).
Let’s take a look at this line first:
land_x, land_y = [int(j) for j in input().split()]
You said you already understand list comprehension, so this is essentially equal to this:
inputs = input().split()
results = []
for j in inputs:
    results.append(int(j))
land_x, land_y = results
This is a combination of multiple things that happen here. input() reads a line of text into the program, split() separates that string into multiple parts, splitting it whenever a white space character appears. So a string 'foo bar' is split into ['foo', 'bar'].
Then, the list comprehension happens, which essentially just iterates over every item in that split input string and converts each item into an integer using int(j). So an input of '2 3' is first converted into ['2', '3'] (list of strings), and then converted into [2, 3] (list of ints).
Finally, the line land_x, land_y = results is evaluated. This is called iterable unpacking and essentially assumes that the iterable on the right has exactly as many items as there are variables on the left. If that’s the case then it’s just a nice way to write the following:
land_x = results[0]
land_y = results[1]
So basically, the whole line assumes that the input consists of two numbers separated by whitespace. It splits those into separate strings, converts them into numbers, and then assigns each number to a separate variable, land_x and land_y.
Exactly the same thing happens again later with the following line:
x, y, h_speed, v_speed, fuel, rotate, power = [int(i) for i in input().split()]
It’s just that this time, it expects the input to have seven numbers instead of just two. But then it’s exactly the same.
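To see what the loop does without a real game feeding input, the lines can be simulated. This is only a sketch: fake_input and its contents are made-up stand-ins for what input() would return, and parse_ints is a hypothetical helper wrapping the list comprehension. It also shows that i is just a counter: it controls how many lines are read and is never used inside the loop body.

```python
def parse_ints(line):
    """Split a line on whitespace and convert every piece to int."""
    return [int(tok) for tok in line.split()]


# Hypothetical input: a count line followed by that many "x y" lines.
fake_input = iter(["3", "0 100", "1000 500", "6999 800"])

surface_n = int(next(fake_input))      # first line: how many points follow
surface = []
for i in range(surface_n):             # i only counts the iterations
    land_x, land_y = parse_ints(next(fake_input))  # one "x y" line per pass
    surface.append((land_x, land_y))   # keep the point; the original code overwrites it

print(surface)  # [(0, 100), (1000, 500), (6999, 800)]
```

In the original snippet land_x and land_y are simply overwritten on every pass, so the points have to be stored somewhere (here, in the surface list) if they are needed later.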

how to apply cell style when using `append` in openpyxl?

I am using openpyxl to create an Excel worksheet. I want to apply styles when I insert the data. The trouble is that the append method takes a list of data and automatically inserts them to cells. I cannot seem to specify a font to apply to this operation.
I can go back and apply a style to individual cells after-the-fact, but this requires overhead to find out how many data points were in the list, and which row I am currently appending to. Is there an easier way?
This illustrative code shows what I would like to do:
def create_xlsx(self, header):
    self.ft_base = Font(name='Calibri', size=10)
    self.ft_bold = self.ft_base.copy(bold=True)
    if header:
        self.ws.append(header, font=ft_bold) # cannot apply style during append
ws.append() is designed for appending rows of data easily. It does, however, also allow you to include placeless cells within a row so that you can apply formatting while adding data. This is primarily of interest when using write_only=True but will work for normal workbooks.
Your code would look something like:
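from openpyxl.cell import Cell
from openpyxl.styles import Font
data = [1, 3, 4, 9, 10]
def styled_cells(data):
    for c in data:
        if c == 1:
            # wrap the value in a Cell so that a style can be attached to it
            c = Cell(ws, column="A", row=1, value=c)
            c.font = Font(bold=True)
        yield c
# ws is an existing worksheet
ws.append(styled_cells(data))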
openpyxl will correct the coordinates of such cells.

How to smooth numbers from a file as many times as wanted in Python 2.7?

I'm trying to write code that will open a file with a list of numbers in it and then take those numbers and smooth them as many times as the user wants. I have it opening and reading the file, but it will not transpose the numbers. In this format it gives this error: TypeError: unsupported operand type(s) for /: 'str' and 'float'. I also need to figure out how to make it transpose the numbers as many times as the user asks. The list of numbers I used in my .txt file is [3, 8, 5, 7, 1].
Here is exactly what I am trying to get it to do:
Ask the user for a filename
Read all floating point data from file into a list
Ask the user how many smoothing passes to make
Display smoothed results with two decimal places
Use functions where appropriate
Algorithm:
Never change the first or last value
Compute new values for all other values by averaging the value with its two neighbors
Here is what I have so far:
filename = raw_input('What is the filename?: ')
inFile = open(filename)
data = inFile.read()
print data
data2 = data[:]
print data2
data2[1]=(data[0]+data[1]+data[2])/3.0
print data2
data2[2]=(data[1]+data[2]+data[3])/3.0
print data2
data2[3]=(data[2]+data[3]+data[4])/3.0
print data2
You almost certainly don't want to be manually indexing the list items. Instead, use a loop:
data2 = data[:]
for i in range(1, len(data)-1):
    data2[i] = sum(data[i-1:i+2])/3.0
data = data2
You can then put that code inside another loop, so that you smooth repeatedly:
smooth_steps = int(raw_input("How many times do you want to smooth the data?"))
for _ in range(smooth_steps):
    # code from above goes here
Note that my code above assumes that you have read numeric values into the data list. However, the code you've shown doesn't do this. You simply use data = inFile.read() which means data is a string. You need to actually parse your file in some way to get a list of numbers.
In your immediate example, where the file contains a Python formatted list literal, you could use eval (or ast.literal_eval if you wanted to be a bit safer). But if this data is going to be used by any other program, you'll probably want a more widely supported format, like CSV, JSON or YAML (all of which have parsers available in Python).
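Put together, the pieces could look like the sketch below. It assumes the file content is a Python list literal such as [3, 8, 5, 7, 1] and parses it with ast.literal_eval. The function names are made up for illustration, and the prompts for the filename and the number of passes are left out, so the text is passed in directly.

```python
import ast


def parse_numbers(text):
    """Parse text such as "[3, 8, 5, 7, 1]" into a list of floats."""
    return [float(x) for x in ast.literal_eval(text.strip())]


def smooth_once(data):
    """Average each interior value with its two neighbours; keep both ends."""
    result = data[:]
    for i in range(1, len(data) - 1):
        # Always read from the unsmoothed list, write into the copy.
        result[i] = sum(data[i - 1:i + 2]) / 3.0
    return result


def smooth(data, passes):
    """Apply smooth_once the requested number of times."""
    for _ in range(passes):
        data = smooth_once(data)
    return data


def format_numbers(data):
    """Format every value with two decimal places."""
    return ", ".join("%.2f" % x for x in data)


values = parse_numbers("[3, 8, 5, 7, 1]")
print(format_numbers(smooth(values, 1)))  # 3.00, 5.33, 6.67, 4.33, 1.00
```

In a real script, the string passed to parse_numbers would come from inFile.read(), and the number of passes from the user prompt shown earlier.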

IO for julia reading fortran files

Noob question:
I have the output of a complex matrix done in Fortran, the contents looks like this:
(-0.594209719263636,1.463867815703586E-006)
(-0.783378034185788,-0.182301028756558) (-0.794024313844809,0.128219337674814)
(0.592814294881930,4.069892201461069E-002)
I want to read and use this data in a julia program.
No, I don't want to change the writing format; I would like to learn how to strip off
the "trash" characters like '(' or ','. This may be useful for arbitrary input files.
I have tried the following code:
file = open(pathtofilename, "r")
data_str = readall(ifile)
data_numbers_str = split(data_str)
data_numbers = split(data_numbers_str, ['('])
However, the manual is not quite self-explanatory [http://docs.julialang.org/en/release-0.2/stdlib/base/?highlight=split].
Here is what I'd do
data = "(-0.594209719263636,1.463867815703586E-006) (-0.783378034185788,-0.182301028756558) (-0.794024313844809,0.128219337674814) (0.592814294881930,4.069892201461069E-002)"
function pair_to_complex(pair)
    nums = float(split(pair[2:end-1], ","))
    return Complex(nums...)
end
numbers = map(pair_to_complex, split(data, " "))
To explain
The pair[2:end-1] removes the parentheses.
I then split that on the , to get an array with two numbers, still as strings.
I convert them to Float64 with float(), obtaining an array of floats.
I make a new complex number. The ... splats the array out so that it provides the two arguments to Complex; I could have written Complex(nums[1],nums[2]) instead.
I then apply this logic to every term in the data using map.