read excel file with aspose version 2.5 - aspose

I want to read an Excel file with Aspose version 2.5. I know that in the latest version we can do it like this:
Workbook w = new Workbook(fileStream);
Cells cells = w.getWorksheets().get(0).getCells();
I didn't find any documentation for Aspose 2.5.

Please see the following sample code and its sample console output. The code reads the source Excel file and prints the names of cells A1 to A10 and their values.
Java
// Open the workbook with inputstream or filepath
Workbook wb = new Workbook();
wb.open("source.xlsx");
// Access first worksheet
Worksheet ws = wb.getWorksheets().getSheet(0);
// Read cells A1 to A10 and print their values
for (int i = 0; i < 10; i++) {
    int row = i;
    int col = 0;
    // Access the cell by row and col indices
    Cell cell = ws.getCells().getCell(row, col);
    // Print cell name and its value
    System.out.println(cell.getName() + ": " + cell.getStringValue());
}
Sample Console Output
A1: 13
A2: 16
A3: 87
A4: 73
A5: 69
A6: 91
A7: 59
A8: 46
A9: 82
A10: 54
Note: I am working as Developer Evangelist at Aspose

Related

Google Sheets Apps Script - How to add an Arrayformula and multiple associated IF functions within a script (Without showing the formula within UI)

I was wondering if someone is able to assist?
I'm trying to add an Arrayformula consisting of two IF functions, so I'm wanting to merge the following two formulas into one cell:
=ARRAYFORMULA(IF(D13:D104="","",(IF(K13:K104,K13:K104*20,"$0"))))
=ARRAYFORMULA(IF(D105:D="","",(IF(K105:K,K105:K*C4,"$0"))))
The first section of the sheet needs to be multiplied by 20; after that the figure changes and the values need to be multiplied by 25 (which is cell C4).
1. Is it possible to merge these into one cell containing an Arrayformula plus the two IF functions (or is there another/easier way to do this)?
2. Is it possible to add this to Google Apps Script so that it works in the backend (so not just a script that applies the formula to a cell, but one where nothing shows in the frontend or sheet)?
3. More of a general question: when using Arrayformula with IF, where for example the output is specific text such as "Test Complete" in the range F2:F (checking whether E2:E contains a particular phrase such as "Done"), is it possible to manually add data to the blank cells in between (blank because the False outcome is set to "") without interrupting the formula? That way I would have automated text when the cell to the left contains a particular term, but still have the option to manually enter arbitrary data into the blank cells.
Any assistance would be greatly appreciated
As for 1st and 2nd questions: it looks like a task for a custom function. Something like this:
function MULTI() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var cell = sheet.getActiveCell();
  var row = cell.getRow();
  var value = sheet.getRange('K'+row).getValue();
  return (row < 105) ? value * 20 : value * 25;
}
It gets a value from column K and multiplies it by 20 if the row is less than 105, and by 25 for the rest of the rows.
Here is the variant of the same formula that uses the cell 'C4':
function MULTIC4() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var cell = sheet.getActiveCell();
  var row = cell.getRow();
  var value = sheet.getRange('K'+row).getValue();
  var c4 = sheet.getRange('C4').getValue();
  return (row < 105) ? value * 20 : value * c4;
}
And it can be done with the trigger onEdit():
function onEdit(e) {
  var col = e.range.columnStart;
  if (col != 11) return; // 11 = K
  var sheet = e.source.getActiveSheet();
  if (sheet.getName() != 'Sheet1') return;
  var c4 = sheet.getRange('C4').getValue();
  var row = e.range.rowStart;
  var dest_cell = sheet.getRange('D'+row);
  var value = sheet.getRange(row,col).getValue();
  var result = (row < 105) ? value * 20 : value * c4;
  dest_cell.setValue(result);
}
It automatically recalculates the value in column 'D' of the current row every time you change a value in column 'K' on the sheet 'Sheet1'.

How To Interpret Least Square Means and Standard Error

I am trying to understand the results I got for a fake dataset. I have two independent variables, hours and type, and a response variable, pain.
First question: How was 82.46721 calculated as the lsmeans for the first type?
Second question: Why is the standard error exactly the same (8.24003) for both types?
Third question: Why is the degrees of freedom 3 for both types?
library(lsmeans)
data = data.frame(
  type = c("A", "A", "A", "B", "B", "B"),
  hours = c(60,72,61, 54,68,66),
  # pain = c(85,95,69, 73, 29, 30)
  pain = c(85,95,69, 85,95,69)
)
model = lm(pain ~ hours + type, data = data)
lsmeans(model, c("type", "hours"))
> data
  type hours pain
1    A    60   85
2    A    72   95
3    A    61   69
4    B    54   85
5    B    68   95
6    B    66   69
> lsmeans(model, c("type", "hours"))
 type hours   lsmean      SE df lower.CL upper.CL
 A     63.5 82.46721 8.24003  3 56.24376 108.6907
 B     63.5 83.53279 8.24003  3 57.30933 109.7562
Try this:
newdat <- data.frame(type = c("A", "B"), hours = c(63.5, 63.5))
predict(model, newdata = newdat)
An important thing to note here is that your model has hours as a continuous predictor, not a factor.
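To make the three questions concrete, here is a rough sketch that recomputes the table by hand. It is written in plain Python rather than R and is not how lsmeans works internally; it only assumes the model has one common slope for hours and a separate intercept per type, which is what pain ~ hours + type specifies. The variable names are made up for illustration.

```python
from math import sqrt

# Data from the question
hours = {"A": [60, 72, 61], "B": [54, 68, 66]}
pain = {"A": [85, 95, 69], "B": [85, 95, 69]}


def mean(values):
    return sum(values) / len(values)


n_total = sum(len(v) for v in hours.values())
# Reference value of the continuous predictor: the overall mean of hours (63.5)
grand_mean_hours = sum(sum(v) for v in hours.values()) / n_total

# Pooled within-group sums of squares and cross-products
sxx = sxy = syy = 0.0
for g in hours:
    mx, my = mean(hours[g]), mean(pain[g])
    sxx += sum((x - mx) ** 2 for x in hours[g])
    sxy += sum((x - mx) * (y - my) for x, y in zip(hours[g], pain[g]))
    syy += sum((y - my) ** 2 for y in pain[g])

slope = sxy / sxx                  # common slope for hours
df = n_total - 3                   # 6 observations minus 3 estimated coefficients
sigma2 = (syy - slope * sxy) / df  # residual variance

lsmeans = {}
std_errors = {}
for g in hours:
    # Move each group's mean pain along the fitted line to hours = 63.5
    shift = grand_mean_hours - mean(hours[g])
    lsmeans[g] = mean(pain[g]) + slope * shift
    std_errors[g] = sqrt(sigma2 * (1 / len(hours[g]) + shift ** 2 / sxx))

for g in hours:
    print(g, round(lsmeans[g], 5), round(std_errors[g], 5), df)
```

Under these assumptions the least-squares mean is the model's prediction for each type at the mean of hours (63.5). The standard errors coincide because both types have three observations and their mean hours are equally far from 63.5. The degrees of freedom are the residual degrees of freedom of the model, 6 - 3 = 3, which is shared by both rows.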

Find sum of the column values based on some other column

I have an input file like this:
j,z,b,bsy,afj,upz,343,13,ruhwd
u,i,a,dvp,ibt,dxv,154,00,adsif
t,a,a,jqj,dtd,yxq,540,49,kxthz
j,z,b,bsy,afj,upz,343,13,ruhwd
u,i,a,dvp,ibt,dxv,154,00,adsif
t,a,a,jqj,dtd,yxq,540,49,kxthz
c,u,g,nfk,ekh,trc,085,83,xppnl
For every unique value of Column1, I need to find out the sum of column7
Similarly, for every unique value of Column2, I need to find out the sum of column7
Output for 1 should be like:
j,686
u,308
t,98
c,83
Output for 2 should be like:
z,686
i,308
a,98
u,83
I am fairly new to Python. How can I achieve the above?
This could be done using Python's Counter and csv library as follows:
from collections import Counter
import csv
c1 = Counter()
c2 = Counter()
with open('input.csv') as f_input:
    for cols in csv.reader(f_input):
        col7 = int(cols[6])
        c1[cols[0]] += col7
        c2[cols[1]] += col7
# The order in which the keys are printed may differ between Python versions
print("Column 1")
for value, count in c1.items():
    print('{},{}'.format(value, count))
print("\nColumn 2")
for value, count in c2.items():
    print('{},{}'.format(value, count))
Giving you the following output:
Column 1
c,85
j,686
u,308
t,1080
Column 2
i,308
a,1080
z,686
u,85
A Counter is a type of Python dictionary that is useful for counting items automatically. c1 holds all of the column 1 entries and c2 holds all of the column 2 entries. Note, Python numbers lists starting from 0, so the first entry in a list is [0].
The csv library loads each line of the file into a list, with each entry in the list representing a different column. The code takes column 7 (i.e. cols[6]) and converts it into an integer, as all columns are held as strings. It is then added to the counter using either the column 1 or 2 value as the key. The result is two dictionaries holding the totaled counts for each key.
You can use pandas:
import pandas as pd
df = pd.read_csv('my_file.csv', header=None)
print(df.groupby(0)[6].sum())
print(df.groupby(1)[6].sum())
Output:
0
c      85
j     686
t    1080
u     308
Name: 6, dtype: int64
1
a    1080
i     308
u      85
z     686
Name: 6, dtype: int64
The data frame should look like this:
print(df.head())
Output:
   0  1  2    3    4    5    6   7      8
0  j  z  b  bsy  afj  upz  343  13  ruhwd
1  u  i  a  dvp  ibt  dxv  154   0  adsif
2  t  a  a  jqj  dtd  yxq  540  49  kxthz
3  j  z  b  bsy  afj  upz  343  13  ruhwd
4  u  i  a  dvp  ibt  dxv  154   0  adsif
You can also use your own names for the columns. Like c1, c2, ... c9:
df = pd.read_csv('my_file.csv', index_col=False, names=['c' + str(x) for x in range(1, 10)])
print(df)
Output:
  c1 c2 c3   c4   c5   c6   c7  c8     c9
0  j  z  b  bsy  afj  upz  343  13  ruhwd
1  u  i  a  dvp  ibt  dxv  154   0  adsif
2  t  a  a  jqj  dtd  yxq  540  49  kxthz
3  j  z  b  bsy  afj  upz  343  13  ruhwd
4  u  i  a  dvp  ibt  dxv  154   0  adsif
5  t  a  a  jqj  dtd  yxq  540  49  kxthz
6  c  u  g  nfk  ekh  trc   85  83  xppnl
Now group by column 1 (c1) or column 2 (c2) and sum up column 7 (c7):
print(df.groupby(['c1'])['c7'].sum())
print(df.groupby(['c2'])['c7'].sum())
Output:
c1
c      85
j     686
t    1080
u     308
Name: c7, dtype: int64
c2
a    1080
i     308
u      85
z     686
Name: c7, dtype: int64
SO isn't supposed to be a code writing service, but I had a few minutes. :) Without Pandas you can do it with the csv module:
import csv
def sum_to(results, key, add_value):
    # Start a new total the first time a key is seen
    if key not in results:
        results[key] = 0
    results[key] += int(add_value)
column1_results = {}
column2_results = {}
with open("input.csv", 'rt') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        sum_to(column1_results, row[0], row[6])
        sum_to(column2_results, row[1], row[6])
# The key order of the printed dictionaries may differ between Python versions
print(column1_results)
print(column2_results)
Results:
{'c': 85, 'j': 686, 'u': 308, 't': 1080}
{'i': 308, 'a': 1080, 'z': 686, 'u': 85}
Your expected results don't seem to match the math that Mike's answer and mine got using your spec. I'd double check that.

match elements from two files, how to write the intended format to a new file

I am trying to update my text file by matching its first column against the first column of another, updated file. Where the names match, the old file's values should be replaced with the new ones.
Here is my oldfile:
Name Chr Pos ind1 in2 in3 ind4
foot 1 5 aa bb cc
ford 3 9 bb cc 00
fake 3 13 dd ee ff
fool 1 5 ee ff gg
fork 1 3 ff gg ee
Here is the newfile:
Name Chr Pos
foot 1 5
fool 2 5
fork 2 6
ford 3 9
fake 3 13
The updated file will be like:
Name Chr Pos ind1 in2 in3 ind4
foot 1 5 aa bb cc
fool 2 5 ee ff gg
fork 2 6 ff gg ee
ford 3 9 bb cc 00
fake 3 13 dd ee ff
Here is my code:
#!/usr/bin/env python
import sys
inputfile_1 = sys.argv[1]
inputfile_2 = sys.argv[2]
outputfile = sys.argv[3]
inputfile1 = open(inputfile_1, 'r')
inputfile2 = open(inputfile_2, 'r')
outputfile = open(outputfile, 'w')
ind = inputfile1.readlines()
cm = inputfile2.readlines()[1:]
outputfile.write(ind[0]) #add header
for i in ind:
    i = i.split()
    for j in cm:
        j = j.split()
        if j[0] == i[0]:
            outputfile.writelines(j[0:3] + i[3:])
            outputfile.write('\n')
inputfile1.close()
inputfile2.close()
outputfile.close()
When I ran it (./compare_substitute_2files.py oldfile newfile output), the values were updated, but the rows did not follow the order of the new file and there were no spaces between the fields, as shown in the output below.
Name Chr Pos ind1 in2 in3 ind4
foot15aabbcc
ford39bbcc00
fake313ddeeff
fool25eeffgg
fork26ffggee
My question is: how can I keep the exact order of the new file and put a space between the elements of the list when writing them out? Thanks!
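file.write accepts a single string as its parameter.
If you want to write a sequence of strings instead, use the file.writelines method:
outputfile.writelines(j[0:3] + i[3:])
Note that writelines does not insert any separator between the items, so the fields still have to be joined with spaces, for example with ' '.join(...).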
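To address the ordering and the missing spaces from the question, here is a minimal sketch of one possible approach. It assumes the first column (Name) is unique in the old file and that the first three columns are always Name, Chr and Pos; merge_lines is a made-up helper name, and the sample lines are copied from the question. The rows come out in the order of the new file because the loop runs over the new file and only looks values up from the old one.

```python
def merge_lines(old_lines, new_lines):
    """Return merged lines in the order of new_lines, fields separated by spaces."""
    header = old_lines[0].split()
    # Map each name to its extra columns (everything after Name, Chr, Pos)
    extras = {}
    for line in old_lines[1:]:
        fields = line.split()
        if fields:
            extras[fields[0]] = fields[3:]
    merged = [" ".join(header)]
    # Walk the new file so that its row order is preserved
    for line in new_lines[1:]:
        fields = line.split()
        if fields and fields[0] in extras:
            merged.append(" ".join(fields[0:3] + extras[fields[0]]))
    return merged


old_lines = [
    "Name Chr Pos ind1 in2 in3 ind4",
    "foot 1 5 aa bb cc",
    "ford 3 9 bb cc 00",
    "fake 3 13 dd ee ff",
    "fool 1 5 ee ff gg",
    "fork 1 3 ff gg ee",
]
new_lines = [
    "Name Chr Pos",
    "foot 1 5",
    "fool 2 5",
    "fork 2 6",
    "ford 3 9",
    "fake 3 13",
]

for line in merge_lines(old_lines, new_lines):
    print(line)
```

With real files, the two lists could come from readlines() on each input file, and the result could be written with outputfile.write("\n".join(merged) + "\n").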

How to read Fortran fixed-width formatted text file in Python?

I have a Fortran-formatted text file (here are the first three rows):
00033+3251 A B C? 6.96 5.480" 358 9.12 F0V 0.00 2.28s 1.00: 2MASS, dJ=1.3
00033+3251 Aa Ab Aab S1,E 0.62 0.273m 0 9.28 F0V 11.28 K2 1.68* 0.32* SB 1469
00033+3251 Aab Ac A E* 4.26 0.076" 0 9.12 F0V 0.00 2.00s 0.28* 2008MNRAS.383.1506
and the file format description:
--------------------------------------------------------------------------------
   Bytes  Format  Units    Label      Explanations
--------------------------------------------------------------------------------
   1- 10  A10     ---      WDS        WDS(J2000)
  12- 14  A3      ---      Primary    Designation of the primary
  16- 18  A3      ---      Secondary  Designation of the secondary component
  20- 22  A3      ---      Parent     Designation of the parent (1)
  24- 29  A6      ---      Type       Observing technique/status (2)
  31- 35  F5.2    d        logP       ? Logarithm (10) of period in days
  37- 44  F8.3    ---      Sep        Separation or axis
      45  A1      ---      x_Sep      ['"m] Units of sep. (',",m)
  47- 49  I3      deg      PA         Position angle
  51- 55  F5.2    mag      Vmag1      V-magnitude of the primary
  57- 61  A5      ---      SP1        Spectral type of the primary
  63- 67  F5.2    mag      Vmag2      V-magnitude of the secondary
  69- 73  A5      ---      SP2        Spectral type of the secondary
  75- 79  F5.2    solMass  Mass1      Mass of the primary
      80  A1      ---      MCode1     Mass estimation code for primary (3)
  82- 86  F5.2    solMass  Mass2      Mass of the secondary
      87  A1      ---      MCode2     Mass estimation code for secondary (3)
  89-108  A20     ---      Rem        Remark
How can I read this file in Python? I have found only the read_fwf function from the pandas library.
import pandas as pd
filename = 'systems'
columns = ((0,10),(11,14),(15,18),(19,22),(23,29),(30,35),(36,44),(45,45),(46,49),(50,55),(56,61),(62,67),(68,73),(74,79),(80,80),(81,86),(87,87),(88,108))
data = pd.read_fwf(filename, colspecs = columns, header=None)
Is this the only possible and effective way? I hope I can do this without pandas. Do you have any suggestions?
columns = ((0,10),(11,14),(15,18),(19,22),(23,29),(30,35),
           (36,44),(44,45),(46,49),(50,55),(56,61),(62,67),
           (68,73),(74,79),(79,80),(81,86),(86,87),(88,108))
string = file.readline()  # file is an already opened file object
dataline = [string[c[0]:c[1]] for c in columns]
Note that the column indices are (startbyte-1, endbyte), so a single-character field is, for example, (44,45).
This leaves you with a list of strings. You probably want to convert them to floats, integers, etc. There are a number of questions here on that topic.
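The conversion step could look roughly like the sketch below. It reuses the slice positions from above and takes one converter per field from the Format column of the description (A as str, F as float, I as int). Returning None for a blank numeric field is my own assumption, as are the names CONVERTERS and parse_record.

```python
# Slice positions (startbyte-1, endbyte), as in the answer above
COLUMNS = ((0, 10), (11, 14), (15, 18), (19, 22), (23, 29), (30, 35),
           (36, 44), (44, 45), (46, 49), (50, 55), (56, 61), (62, 67),
           (68, 73), (74, 79), (79, 80), (81, 86), (86, 87), (88, 108))

# One converter per field, following the Format column (A -> str, F -> float, I -> int)
CONVERTERS = (str, str, str, str, str, float,
              float, str, int, float, str, float,
              str, float, str, float, str, str)


def parse_record(line):
    """Split one fixed-width line into typed values; blank numbers become None."""
    # Pad short lines so that every slice is well defined
    line = line.rstrip("\n").ljust(108)
    values = []
    for (start, end), convert in zip(COLUMNS, CONVERTERS):
        text = line[start:end].strip()
        if convert is str:
            values.append(text)
        else:
            values.append(convert(text) if text else None)
    return values
```

Applied to a whole file, this would be something like records = [parse_record(line) for line in open(filename)]. Fields carrying extra flag characters would need a more forgiving converter.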
There is a module, fortranformat, that provides FortranRecordReader, but it copes poorly with the stars, comments, etc. that modern Fortran files contain. Still, for a clean file it is useful, in combination with namedtuple. Example:
from fortranformat import FortranRecordReader
fline=FortranRecordReader('(a1,i3,i5,i5,i5,1x,a3,a4,1x,f13.5,f11.5,f11.3,f9.3,1x,a2,f11.3,f9.3,1x,i3,1x,f12.5,f11.5)')
from collections import namedtuple
record=namedtuple('nucleo','cc NZ N Z A el o massexcess uncmassex binding uncbind B beta uncbeta am_int am_float uncatmass')
f=open('AME2012.mas12.ff','r')
for line in f:
    nucl=record._make(fline.read(line))
You can also try the "parse" module, or write your own parser.
This type of file can be read with astropy tables. The header you show looks a lot like a CDS-formatted ascii table, which has a specific reader implemented for it:
http://astropy.readthedocs.org/en/latest/api/astropy.io.ascii.Cds.html#astropy.io.ascii.Cds
Expanding on arivero's answer, you could use fortranformat from pypi - here is what I would try ...
from fortranformat import FortranRecordReader
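# 1X skips the blank byte between fields, following the byte ranges in the format description
fmt = FortranRecordReader('(A10,1X,A3,1X,A3,1X,A3,1X,A6,1X,F5.2,1X,F8.3,A1,1X,I3,1X,F5.2,1X,A5,1X,F5.2,1X,A5,1X,F5.2,A1,1X,F5.2,A1,1X,A20)')
with open('myfile.txt', 'r') as fh:
    for line in fh:
        line_vals = fmt.read(line)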
This should convert the values appropriately to numbers, bool etc.