I am using FORTRAN77 as a third-party language in the ANSYS computation software. Here we can write entire rows and columns to files during I/O operations. However, I am not able to move the cursor back to the first row and write column-wise from there, for every column in the 2D array. Unfortunately, it writes all the data into a single column. I need to know what I can use in the place marked XXX:
*CFOPEN, ACT_STR, CSV,,APPEND
*DO,INF,1,2*S,1
*VWRITE, S0(1,INF),
(XXX,F10.2,',')
*CFCLOS
You can try transposing the matrix and then printing it row-wise. You can write a small subroutine that does the transpose for S0.
I have a data table in this format:
and I want to plot temperature against time. Any idea how to do that?
This can be done in a TERR data function. I don't know how comfortable you are integrating Spotfire with TERR; there is an intro video here, for instance (the demo starts at about minute 7):
https://www.youtube.com/watch?v=ZtVltmmKWQs
With that in mind, I wrote the script without loading any library, so it is quite verbose and explicit, but hopefully simpler to follow step by step. I am sure there is a more elegant way, and there are better ways of making it flexible with column names, but this is a start.
Your input will be a data table (dt, the original data) and the output a new data table (dt.out, the transformed data). All column names (and some values) are addressed explicitly in the script (so if you change them it won't work).
#remove the []
dt$Values=gsub('\\[|\\]','',dt$Values)
#separate into two different data frames, one for time and one for temperature
dt.time=dt[dt$Description=='time',]
dt.temperature=dt[dt$Description=='temperature',]
#split the columns we want to separate into a list of vectors
dt2.time=strsplit(as.character(dt.time$Values),',')
dt2.temperature=strsplit(as.character(dt.temperature$Values),',')
#rearrange times
names(dt2.time)=dt.time$object
dt2.time=stack(dt2.time) #stack vectors
dt2.time$id=c(1:nrow(dt2.time)) #assign running id for merging later
colnames(dt2.time)[colnames(dt2.time)=='values']='time'
#rearrange temperatures
names(dt2.temperature)=dt.temperature$object
dt2.temperature=stack(dt2.temperature) #stack vectors
dt2.temperature$id=c(1:nrow(dt2.temperature)) #assign running id for merging later
colnames(dt2.temperature)[colnames(dt2.temperature)=='values']='temperature'
#merge time and temperature
dt.out=merge(dt2.time,dt2.temperature,by=c('id','ind'))
colnames(dt.out)[colnames(dt.out)=='ind']='object'
dt.out$time=as.numeric(dt.out$time)
dt.out$temperature=as.numeric(dt.out$temperature)
Gaia
Because all of the example rows you've shown here contain exactly four list items and you haven't specified otherwise, I'll assume that all of the data fits this format.
With this assumption, it becomes pretty trivial, albeit a little messy, to split the values out into columns using the RXReplace() expression function.
You can create four calculated columns, each with an expression like:
Int(RXReplace([values],"\\[([\\d\\-]+),([\\d\\-]+),([\\d\\-]+),([\\d\\-]+)]","\\1",""))
The third argument "\\1" determines which number in the list to extract. Backslashes are doubled ("escaped") per the requirements of the RXReplace() function.
Note that this example assumes the numbers are all whole numbers. If you have decimals, you'd need to adjust each "phrase" of the regular expression to ([\\d\\-\\.]+), and you'd need to wrap the expression in Real() rather than Int() (if you leave the wrapper out, the result will be a String type, which could cause confusion later on when working with the data).
Once you have the four columns, you'll be able to unpivot to get the data easily.
I am exporting a lot of strings from Stata to Excel.
For a single column of 3000+ rows with a different string in each, I need to check the length of each string/cell. I could do this in Stata using the length() function, but I need to be able to open the Excel file, edit a given string, and have the length update automatically in Excel.
This seems like it should be simple using the putexcel command or mata's put_formula() function, but the time to run is prohibitive.
At root, my question is about building many relative references (e.g., =LEN(A1)) in mata all at once, as opposed to one at a time.
This may make more sense after seeing the code below:
mata: b = xl()
mata: b.create_book("Formula_Test", "Formula_Test", "xlsx")
mata: b.load_book("Formula_Test")
*Put some strings in column 1
mata: b.put_string(1, 1, "asfas")
mata: b.put_string(2, 1, "sfhds")
mata: b.put_string(3, 1, "qwrq")
mata: b.put_string(4, 1, "dgsdgsdgsdgs")
*Formula - export one-at-a-time
*This works, but is slow
foreach i of numlist 1/4{
mata: b.put_formula(`i', 2, "LEN(A`i')")
}
*Formula - export all at once with relative reference
*This would be faster, but throws error
mata: b.put_formula((1,4), 3, "LEN(INDIRECT("C[-2]",FALSE))")
When I run the last line, I get an error:
invalid expression
r(3000);
Is there an efficient way to write an entire column or row of Excel formulas using mata, with relative references?
The mata function put_formula() only accepts scalars for rows and columns. Note that you also need to use compound double quotes in its string matrix argument.
Looping in mata is always faster than doing so in Stata:
mata:
for (i = 1; i <= 4; i++) {
b.put_formula(i, 2, `"LEN(INDIRECT("C[-1]",FALSE))"')
}
end
Nevertheless, despite the limitation of having to use scalars as arguments for rows and columns in put_formula(), a loop is in fact not necessary. This is because one can pass a string matrix of constants, built with J(), as the final argument.
Indeed, the following does the same in seconds:
mata:
k = J(3000, 1, `"LEN(INDIRECT("C[-1]",FALSE))"')
b.put_formula(1, 2, k)
end
In this way, the 3000 x 1 string matrix k is written in a single call, starting at cell B1 of the spreadsheet. Because it has 3000 rows, it naturally extends to all cells down to B3000.
This answer is secondary but may be useful to someone.
The inefficiency in the code in the question -- looping through a numlist and writing the formula in mata one cell at a time -- comes partially from the use of a Stata loop (as Pearly Spencer pointed out and corrected). But a bigger issue is the number of times mata has to write individual cells when the example is expanded from 4 cells to several thousand.
If you can avoid looping and writing many cells individually, using -putexcel- or mata's b.put_formula are not dramatically different in speed in most applications. If you are writing cells in a single column, row, or matrix of cells, and can write them all at once, either option will be fast. A -putexcel- example:
*A -putexcel- example
mata: b = xl()
mata: b.create_book("Formula_Test", "Formula_Test", "xlsx")
putexcel set "Formula_Test", sheet("Formula_Test") modify
putexcel B1:B30000 = formula(`" =LEN(INDIRECT("C[-1]",FALSE)) "')
For 30,000 cells in a single column, -putexcel- took 37 seconds.
Using Pearly Spencer's J matrix approach in mata took 36 seconds.
The important point is: if you are writing a formula to many cells, try to consolidate it into blocks that can be written together as matrices, rather than looping over all cells. This will give you the biggest speed gains; using mata instead of -putexcel- will help, but will provide only a second-order improvement. Even in mata it will take a long time to write individually to thousands of cells.
I have a .CSV file that's storing data from a laser. It records the height of the laser beam every second.
The .CSV file ends up having rows for each measurement that are all in this format:
DR,04,#
where the # is the height reading.
For example, if the beam is at a height of 10, the reading would say:
DR,04,10
I want my program in C++ to read only the height (third column of the .CSV) from each row and put it into an array. I do not want the first two columns at all. That way I end up with an array with just a bunch of height values from each measurement.
How do I do that?
You can use strtok() to separate out the three columns. And then just get the last value.
You could also just take the string and scan for the first comma, and then scan from there for the second comma. What follows is the value you are after.
You could also use sscanf() to parse out the individual values.
This really isn't a difficult problem, and there are many ways to approach it. That is why people are complaining that you probably should have tried something first and then asked a question here once you got stuck on a specific point.
I have a CSV file that has about 10 different columns. I'm trying to figure out what the best approach is here.
Data looks like this:
"20070906 1 0 0 NO"
There are about 40,000 records like this to be analyzed. I'm not sure what's best here: split each column into its own vector, or put each whole row into a vector.
Thanks!
I think this is a fairly subjective question, but in my opinion a single vector that contains the split-up rows will likely be easier to manage than separate vectors for each column. You could even create a row object for the vector to store, to make accessing and processing the data in the rows/columns friendlier.
Although if you are only doing processing on a column level and not on a row or entry level having individual column vectors would be easier.
Since the data set is fairly small (assuming you are using a PC and not some other device, like a smartphone), you can read the file line by line into a vector of strings, then parse the elements one by one and populate a vector of structures holding the record data.
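A minimal C++ sketch of the "vector of structures" idea follows. The Record layout and its field names are invented for illustration, since only the sample row is known. It also assumes the fields are separated by whitespace as in that sample, and that the surrounding quotes are not part of the file. The sketch parses each line as it is read rather than storing the raw strings first.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record layout: the field names are made up, based only on
// the sample row "20070906 1 0 0 NO". Extend it to cover all columns.
struct Record {
    std::string date;  // e.g. "20070906"
    int a = 0;
    int b = 0;
    int c = 0;
    std::string flag;  // e.g. "NO"
};

// Reads one record per line, assuming whitespace-separated fields.
// Lines that do not match the layout are skipped.
std::vector<Record> readRecords(std::istream& in) {
    std::vector<Record> records;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        Record r;
        if (fields >> r.date >> r.a >> r.b >> r.c >> r.flag) {
            records.push_back(r);
        }
    }
    return records;
}
```

With about 40,000 rows this keeps each record together, so row-level processing is a simple loop over the vector. If the real file is comma-separated, the per-line parsing would need to split on commas instead.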
So, I have this program that collects a bunch of interesting data. I want to have a library that I can use to sort this data into columns and rows (or similar), save it to a file, and then use some other program (like OpenOffice Spreadsheet, or MATLAB since I own it, or maybe some other spreadsheet/database grapher that I don't know of) to analyse and graph the data however I want. I prefer this library to be open source, but it's not really a requirement.
OK, so my mistake, you wanted a writer. Writing a CSV is simple, and apparently reading one into MATLAB is simple too.
http://www.mathworks.com.au/help/techdoc/ref/csvread.html
A CSV has a simple structure: rows are separated by newlines, and the columns within each row are separated by commas.
0,10,15,12
4,7,0,3
So all you really need to do is grab your data, separate it into rows, then write out one line per row with each column separated by a comma.
If you need a code example I can edit again but this shouldn't be too difficult.