I need to interpolate two variables written in the 3D grid on another 3D grid. I tried the inverse distance method, but I get only two values that do not represent the distribution on the original grid, assigned to each point of the new grid. Here is an example of my code:
text=text[pstart:pend]
x=[]
y=[]
z=[]
for line in text:
coords=line.split()
x.append(float(coords[2])) #coordinates of the new grid
y.append(float(coords[1]))
z.append(float(coords[0]))
Xg=np.asarray([x,y,z])
# Gather mean flow data
xd=[]
yd=[]
zd=[]
cd=[]
rhod=[]
with open(meanflowdata,'rb') as csvfile:
spamreader=csv.reader(csvfile, delimiter=',')
for row in spamreader:
if len(row)>2:
xd.append(float(row[0])) #coordinates and values of the source file
yd.append(float(row[1]))
zd.append(float(row[2]))
cd.append(float(row[3]))
rhod.append(float(row[4]))
Xd=np.asarray([xd,yd,zd])
Zd=np.asarray([cd,rhod])
leafsize = 20
print "# setting up KDtree"
invdisttree = Invdisttree( Xd.T, Zd.T, leafsize=leafsize, stat=1 )
print "# Performing interpolation"
interpol = invdisttree( Xg.T )
c=interpol.T[0]
rho=interpol.T[1]
As far as I could check, the problem lies when I call the invdisttree function, which does not work properly. Does someone have an idea or an alternative method to suggest for the interpolation?
Where do interpol.T[0], interpol.T[1] come from,
where did your Invdisttree come from ?
This
on SO has
invdisttree = Invdisttree( X, z ) -- data points, values
interpol = invdisttree( q, nnear=3, eps=0, p=1, weights=None, stat=0 )
In your case X could be 100 x 3, z 100 x 2,
query points q 10 x 3 ⟶ interpol 10 x 2.
(invdisttree is a function, which you call to do the interpolation:
interpol = invdisttree( q ...) . Is that confusing ?)
The following seems obvious, yet it does not behave as I would expect. I want to do k-fold cross validation without using SCC packages, and thought I could just filter my data and run my own regressions on the subsets.
First I generate a variable with a random integer between 1 and 5 (5-fold cross validation), then I loop over each fold number. I want to filter the data by the fold number, but using a boolean filter fails to filter anything. Why?
Bonus: what would be the best way to capture all of the test MSEs and average them? In Python I would just make a list or a numpy array and take the average.
gen randint = floor((6-1)*runiform()+1)
recast int randint
forval b = 1(1)5 {
xtreg c.DepVar /// // training set
c.IndVar1 ///
c.IndVar2 ///
if randint !=`b' ///
, fe vce(cluster uuid)
xtreg c.DepVar /// // test set, needs to be performed with model above, not a
c.IndVar1 /// // new model...
c.IndVar2 ///
if randint ==`b' ///
, fe vce(cluster uuid)
}
EDIT: Test set needs to be performed with model fit to training set. I changed my comment in the code to reflect this.
Ultimately the solution to the filtering issue was I was using a scalar in quotes to define the bounds and I had:
replace randint = floor((`varscalar'-1)*runiform()+1)
instead of just
replace randint = floor((varscalar-1)*runiform()+1)
When and where to use the quotes in Stata is confusing to me. I cannot just use varscalar in a loop, I have to use `=varscalar', but I can for some reason use varscalar - 1 and get the expected result. Interestingly, I cannot use
replace randint = floor((`varscalar')*runiform()+1)
I suppose I should just use
replace randint = floor((`=varscalar')*runiform()+1)
So why is it ok to use the version with the minus one and without the equals sign??
The answer below is still extremely helpful and I learned much from it.
As a matter of fact, two different things are going on here that are not necessarily directly related. 1) How to filter data with a randomly generated integer value and 2) k-fold cross-validation procedure.
For the first one, I will leave an example below that could help you work things out using Stata with some tools that can be easily transferable to other problems (such as matrix generation and manipulation to store the metrics). However, I would call neither your sketch of code nor my example "k-fold cross-validation", mainly because they fit the model, both in the testing and in training data. Nonetheless, the case should be that strictly speaking, the model should be trained in the training data, and using those parameters, assess the performance of the model in testing data.
For further references on the procedure Scikit-learn has done brilliant work explaining it with several visualizations included.
That being said, here is something that could be helpful.
clear all
set seed 4
set obs 100
*Simulate model
gen x1 = rnormal()
gen x2 = rnormal()
gen y = 1 + 0.5 * x1 + 1.5 *x2 + rnormal()
gen byte randint = runiformint(1, 5)
tab randint
/*
randint | Freq. Percent Cum.
------------+-----------------------------------
1 | 17 17.00 17.00
2 | 18 18.00 35.00
3 | 21 21.00 56.00
4 | 19 19.00 75.00
5 | 25 25.00 100.00
------------+-----------------------------------
Total | 100 100.00
*/
// create a matrix to store results
matrix res = J(5,4,.)
matrix colnames res = "R2_fold" "MSE_fold" "R2_hold" "MSE_hold"
matrix rownames res ="1" "2" "3" "4" "5"
// show formated empty matrix
matrix li res
/*
res[5,4]
R2_fold MSE_fold R2_hold MSE_hold
1 . . . .
2 . . . .
3 . . . .
4 . . . .
5 . . . .
*/
// loop over different samples
forvalues b = 1/5 {
// run the model using fold == `b'
qui reg y x1 x2 if randint ==`b'
// save R squared training
matrix res[`b', 1] = e(r2)
// save rmse training
matrix res[`b', 2] = e(rmse)
// run the model using fold != `b'
qui reg y x1 x2 if randint !=`b'
// save R squared training (?)
matrix res[`b', 3] = e(r2)
// save rmse testing (?)
matrix res[`b', 4] = e(rmse)
}
// Show matrix with stored metrics
mat li res
/*
res[5,4]
R2_fold MSE_fold R2_hold MSE_hold
1 .50949187 1.2877728 .74155365 1.0070531
2 .89942838 .71776458 .66401888 1.089422
3 .75542004 1.0870525 .68884359 1.0517139
4 .68140328 1.1103964 .71990589 1.0329239
5 .68816084 1.0017175 .71229925 1.0596865
*/
// some matrix algebra workout to obtain the mean of the metrics
mat U = J(rowsof(res),1,1)
mat sum = U'*res
/* create vector of column (variable) means */
mat mean_res = sum/rowsof(res)
// show the average of the metrics acros the holds
mat li mean_res
/*
mean_res[1,4]
R2_fold MSE_fold R2_hold MSE_hold
c1 .70678088 1.0409408 .70532425 1.0481599
*/
i have a truss bridge that i want to draw using fortran 90 and Gnuplot , but the truss elements should change color if the corresponding array element is bigger or smaller than zero
open(10,file='plot_2D.plt')
write(10,*) 'set title " truss"'
write(10,*) 'set xrange [0:300]'
write(10,*) 'set yrange [0:40]'
write(10,*) 'set xlabel "x [U]"'
write(10,*) 'set ylabel "y [U]"'
write(10,*) 'set key noautotitle'
write(10,*) 'plot "1.txt" if (x(1) > 0 ) {with line 7} else { with line lt 3 } '
write(10,*) 'pause -1 "Hit return to continue"'
close(10)
ofcourse i have 1.txt up to 222.txt
but i just put one here because it is repetitive process
but i dont get any plot ,
what is my mistake ?
It is not clear what exactly you mean by x(1) > 0 since there is no function x() defined. Assuming that x(1) means data value in column 1, then instead of
write(10,*) 'plot "1.txt" if (x(1) > 0 ) {with line 7} else { with line lt 3 } '
the correct syntax is
write(10,*) 'plot "1.txt" using 1:2:($1>0 ? 7 : 3) with lines linecolor variable'
Does not display anything after data entry
program g
implicit none
real::q,n,s,z,q2,y,free_board,r,b,e,A,h,t
write(*,100)"pls insert discharge Q ="
read(*,*)q
write(*,100)"please insert Manning coefficient n ="
read(*,*)n
write(*,*)"please insert slope of the hydraulic channel ="
read(*,*)s
write(*,*)"please inset Z ="
read(*,*)z
write(*,*)"how much of b/y do you want?"
write(*,*)"if it not important right 2.5"
read(*,*)e
if(e<2.or.e>5)then
stop
end if
y=0
do
b=y*e
A=b+2*y*((1+Z**2)**(0.5))
R=((b+z*y)*y)/(b+(2*y*(1+z**2)**(0.5)))
h=(1/n)*(r**(2/3))*A*(s)**0.5
if( abs(h-q)<0.01) then
exit
end if
y=0.001+y
end do
free_board=0.2*y
h=free_board+y
t=b+2*y*z
write(*,100)"free board="
write(*,*) free_board
write(*,100)"y="
write(*,*)y
write(*,100)"b="
write(*,*)b
write(*,100)"T="
write(*,100)t
100 format(A)
end program g
this not work and not show anything after enter data
For sure the line write(*,100)t will give you a wrong output, since "t" it is a real, not a string. Please change it to write(*,*).
With all inputs equal to 1.0 and by putting e=2.5, I see these output (at screen):
free board = 3.7200041E-02
y = 0.1860002
b = 0.4650005
T = 0.8370009
If you don't see outputs, maybe you are choosing wrong "e" values (less than 2 or more than 5).
So I'm trying to come up with a clever way to make this program read a catalog and take anything falling within specific spatial "grid" boxes and average the data in that box together. I'll paste my horrid attempt below and hopefully you'll see what I'm trying to do. I can't get the program to work correctly (it gets stuck in a loop somewhere that I haven't debugged), and before I bang my head against it anymore I want to know if this looks like a logical set of operations for what I'm looking to do, or if there is a better way to accomplish this.
Edit: To clarify, the argument section is for the trimming parameters---"lmin lmax bmin bmax" set the overall frame, and "deg" sets the square-degree increments.
program redgrid
implicit none
! Variable declarations and settings:
integer :: ncrt, c, i, j, k, count, n, iarg, D, db, cn
real :: dsun, pma, pmd, epma, epmd, ra, dec, degbin
real :: V, Per, Amp, FeH, EBV, Dm, Fi, FeHav, EBVav
real :: lmin, lmax, bmin, bmax, l, b, deg, lbin, bbin
real :: bbinmax, bbinmin, lbinmax, lbinmin
character(len=60) :: infile, outfile, word, name
parameter(D=20000)
dimension :: EBV(D), FeH(D), lbinmax(D), bbinmax(D)
dimension :: bbinmin(D), lbinmin(D)
103 format(1x,i6,4x,f6.2,4x,f6.2,4x,f7.2,3x,f6.2,4x,f5.2,4x,f5.2,4x,f5.2,4x,f6.4)
3 continue
iarg=iargc()
if(iarg.lt.7) then
print*, 'Usage: redgrid infile outfile lmin lmax bmin bmax square_deg'
stop
endif
call getarg(1, infile)
call getarg(2, outfile)
call getarg(3, word)
read(word,*) lmin
call getarg(4, word)
read(word,*) lmax
call getarg(5, word)
read(word,*) bmin
call getarg(6, word)
read(word,*) bmax
call getarg(7, word)
read(word,*) deg
open(unit=1,file=infile,status='old',err=3)
open(unit=2,file=outfile,status='unknown')
write(2,*)"| l center | b center | [Fe/H] avg | E(B-V) avg | "
FeHav = 0.0
EBVav = 0.0
lbinmin(1) = lmin
bbinmin(1) = bmin
degbin = (bmax-bmin)/deg
db = NINT(degbin)
do j = 1, db
bbinmax(j) = bbinmin(j) + deg
lbinmax(j) = lbinmin(j)*cos(bbinmax(j))
print*, lbinmin(j), bbinmin(j), db
cn = 1
7 continue
read(1,*,err=7,end=8) ncrt, ra, dec, l, b,&
V, dsun, FeH(cn), EBV(cn)
if(b.ge.bbinmin(j).and.b.lt.bbinmax(j)) then
if(l.ge.lbinmin(j).and.l.lt.lbinmax(j)) then
FeHav = FeHav + FeH(cn)
EBVav = EBVav + EBV(cn)
cn = cn + 1
end if
end if
goto 7
8 continue
FeHav = FeHav/cn
EBVav = EBVav/cn
write(2,*) lbinmax(j), bbinmax(j), FeHav, EBVav
bbinmin(j+1) = bbinmin(j) + deg
lbinmin(j+1) = lbinmin(j) + deg
end do
close(1)
close(2)
end program redgrid
Below is a small section of the table I'm working with. "l" and "b" are the two coordinates I am working with---they are angular, hence the need to make the grid components "b" and "l*cos(b)." For each 0.5 x 0.5 degree section, I need to have averages of E(B-V) and [Fe/H] within that block. When I write the file all I need are four columns: the two coordinates where the box is located, and the two averages for that box.
| Ncrt | ra | dec | l | b | V | dkpc | [Fe/H] | E(B-V) |
7888 216.53 -43.85 -39.56 15.78 15.68 8.90 -1.19 0.1420
7889 217.49 -43.13 -38.61 16.18 16.15 10.67 -1.15 0.1750
7893 219.16 -43.26 -37.50 15.58 15.38 7.79 -1.40 0.1580
Right now, the program gets stuck somewhere in the loop cycle. I've pasted the terminal output that happens when I run it, along with the command line I'm running it with. Please let me know if I can help clarify. This is a pretty complex problem for a Fortran rookie such as myself---perhaps I'm missing some fundamental knowledge that would make it much easier. Anyways, thanks in advance.
./redgrid table2.above redtest.trim -40 0 15 30 0.5
-40.0000000 15.0000000 30 0.00000000 0.00000000
-39.5000000 15.5000000 30 -1.18592596 0.353437036
^it gets stuck after two lines.
I assume that the program does what you want it to do, but you are looking for a few things to tidy the code up.
Well first up, I'd fix up the indentation.
Secondly, I'd not use unit numbers below 10.
INTEGER, PARAMETER :: in_unit = 100
INTEGER, PARAMETER :: out_unit = 101
...
OPEN(unit=in_unit, file=infile, status='OLD")
...
READ(in_unit, *) ...
...
CLOSE(in_unit)
Thirdly, I'd not use GOTOs and labels. You can do that in a loop far easier:
INTEGER :: read_status
DO j = 1, db
...
read_loop : DO
READ(in_unit, *, IOSTAT=read_status) ...
IF (read_status == -1) THEN ! EOF
EXIT read_loop
ELSEIF (read_status /= 0) THEN
CYCLE read_loop
ENDIF
...
END DO read_loop
...
END DO
There are a few dangers in your code, and even in this one above: It can lead to infinite loops. For example, if the opening of infile fails (e.g. the file doesn't exist), it loops back to label 3, but nothing changes, so it will eventually again try to open the same file, and probably have the same error.
Same above: If READ repeatedly fails without advancing, and without the error being an EOF, then the read loop will not terminate.
You have to think about what you want your program to do when something like this happens, and code it in. (For example: Print an error message and STOP if it can't open the file.)
You have a very long FORMAT statement. You can leave it like that, though I'd probably try to shorten it a bit:
103 FORMAT(I7, 2F10.2, F11.2, 4F9.2, F10.4)
This should be the same line, as numbers are usually right-aligned. You can also use strings as a format, so you could also do something like this:
CHARACTER(LEN=*), PARAMETER :: data_out_form = &
'(I7, 2F10.2, F11.2, 4F9.2, F10.4)'
WRITE(*, data_out_form) var1, var2, var3, ...
and again, that's one less label.