Get Mbase (MVA) of Machine from PSSE for Python code - python-2.7

Please help me to get Mbase (MVA) of Machine from PSSE for Python.
I would like to use this to calculate with inertia (H) value.
Although I can get H as syntax below, I don't know how to get Mbase (MVA).
ierr = psspy.rwdy(option1=2,option2=0,out=0,ofile="C:\Program Files (x86)\PTI\PSSE34\EXAMPLE\python_test1.out")
21421 'GENROU' 1 11.000 0.47000E-01 0.67000 0.50000E-01 6.2300 0.0000 2.1000 1.5500 0.21000 0.40000 0.16000 0.13000 0.36100 0.69300
Thanks!

You can use psspy.macdat() as follows to store the value in a new variable mbase:
ierr, mbase = psspy.macdat(
ibus=bus_number, # bus number where machine is connected as an `int` object
id=id_, # machine ID as a `str` object
string='MBASE',
)
You will of course need to have already defined bus_number and id_.
You may see the other options by reading the docstring:
import psse34
import psspy
help(psspy.macdat)

Thank you for your support.
Based on your introduction, I can get Mbase.
Due to a lot of machines, so I have to use for loop in order to get all of the power data of machines. Please give me some advice whether there is any other way than to use a for loop.
Please see the picture result on Notepad++
enter image description here
That's so kind of you.
import psse34
import psspy
# Last case:
CASE = r"C:\Program Files (x86)\PTI\PSSE34\EXAMPLE\savnw.sav"
psspy.psseinit(12000)
psspy.case(CASE)
ierr = psspy.dyre_add(dyrefile="C:\Program Files (x86)\PTI\PSSE34\EXAMPLE\savnw.dyr")
ierr = psspy.rstr("C:\Program Files (x86)\PTI\PSSE34\EXAMPLE\savnw.snp")
ierr = psspy.rwdy(1,1,ofile="C:\Program Files (x86)\PTI\PSSE34\EXAMPLE\python_test1.out") # find inertia (H) of machine
machine1=[101,102,206,211,3011,3018]
for x in machine1:
ierr, mbase = psspy.macdat(ibus=x, id='1', string='MBASE') # find power of machine
print(mbase)

Related

How to get optimality gap using CPLEX Python API?

How can the optimality gap (relative and absolute) be obtained using CPLEX Python API (CPLEX 12.5, Python 2.7.15)? Is there any function such as get_optimal_gap() that would give the optimality gap? Or is parsing the output (as mentioned here) the only way? I see that there are custom functions such as solution.get_objective_value() - it would be nice if someone can suggest online resources that have a list of all functions that can be applied on the cplex object/file. Currently I am using the following code (courtesy: this IBM document):
import cplex
import sys
def sample1(filename):
c = cplex.Cplex(filename)
try:
c.solve()
except CplexSolverError:
print "Exception raised during solve"
return
# solution.get_status() returns an integer code
status = c.solution.get_status()
print "Solution status = " , status, ":",
print c.solution.status[status]
print "Objective value = " , c.solution.get_objective_value()
sample1(filename)
The documentation for the CPLEX Python API is here (currently version 12.8).
To get the relative mip gap, you can use c.solution.MIP.get_mip_relative_gap(). To calculate the absolute mip gap you'll need c.solution.MIP.get_best_objective().
If you haven't already, you'll also want to take a look at the CPX_PARAM_EPGAP and CPX_PARAM_EPAGAP parameters.

ODEINT with multiple parameters (time-dependent)

I'm trying to solve a single first-order ODE using ODEINT. Following is the code. I expect to get 3 values of y for 3 time-points. The issue I'm struggling with is ability to pass nth value of mt and nt to calculate dydt. I think the ODEINT passes all 3 values of mt and nt, instead just 0th, 1st or 2nd, depending on the iteration. Because of this, I get this error:
RuntimeError: The size of the array returned by func (4) does not match the size of y0 (1).
Interestingly, if I replace the initial condition, which is (and should be) a single value as: a0= [2]*4, the code works, but gives me a 4X4 matrix as solution, which seems incorrect.
mt = np.array([3,7,4,2]) # Array of constants
nt = np.array([5,1,9,3]) # Array of constants
c1,c2,c3 = [-0.3,1.4,-0.5] # co-efficients
para = [mt,nt] # Packing parameters
#Test ODE function
def test (y,t,extra):
m,n = extra
dydt = c1*c2*m - c1*y - c3*n
return dydt
a0= [2] # Initial Condition
tspan = range(len(mt)) # Define tspan
#Solving the ODE
yt= odeint(test, a0,tspan,args=(para,))
#Plotting the ODE
plt.plot(tspan,yt,'g')
plt.title('Multiple Parameters Test')
plt.xlabel('Time')
plt.ylabel('Magnitude')
The first order differential equation is:
dy/dt = c1*(c2*mt-y(t)) - c3*nt
This equation represents a part of murine endocrine system, which I am trying to model. The system is analogous to a two-tank system, where the first tank receives a specific hormone [at an unknown rate] but our sensor will detect that level (mt) at specific time intervals (1 second). This tank then feeds into the second tank, where the level of this hormone (y) is detected by another sensor. I labeled the levels using separate variables because the sensors that detect the levels are independent of each other and are not calibrated to each other. 'c2' may be considered as the co-efficient that shows the correlation between the two levels. Also, the transfer of this hormone from tank 1 to tank 2 is diffusion-driven. This hormone is further consumed by a biochemical process (similar to a drain valve for the second tank). At the moment, it is unclear which parameters affect the consumption; however, another sensor can detect the amount of hormone (nt) being consumed at a specific time interval (1 second, in this case too).
Thus, mt and nt are the concentrations/levels of the hormone at specific time points. although only 4-element in length in the code, these arrays are much longer in my study. All sensors report the concentrations at 1 second interval - hence tspan consists of time points separated by 1 second.
The objective is to determine the concentration of this hormone in the second tank (y) mathematically and then optimize the values of these coefficients based on the experimental data. I was able to pass these arrays mt and nt to the defined ODE and solve using ODE45 in MATLAB with no issue. I've been running into this RunTimeError, while trying to replicate the code in Python.
As I mentioned in a comment, if you want to model this system using an ordinary differential equation, you have to make an assumption about the values of m and n between sample times. One possible model is to use linear interpolation. Here's a script that uses scipy.interpolate.interp1d to create the functions mfunc(t) and nfunc(t) based on the samples mt and nt.
import numpy as np
from scipy.integrate import odeint
from scipy.interpolate import interp1d
import matplotlib.pyplot as plt
mt = np.array([3,7,4,2]) # Array of constants
nt = np.array([5,1,9,3]) # Array of constants
c1, c2, c3 = [-0.3, 1.4, -0.5] # co-efficients
# Create linear interpolators for m(t) and n(t).
sample_times = np.arange(len(mt))
mfunc = interp1d(sample_times, mt, bounds_error=False, fill_value="extrapolate")
nfunc = interp1d(sample_times, nt, bounds_error=False, fill_value="extrapolate")
# Test ODE function
def test (y, t):
dydt = c1*c2*mfunc(t) - c1*y - c3*nfunc(t)
return dydt
a0 = [2] # Initial Condition
tspan = np.linspace(0, sample_times.max(), 8*len(sample_times)+1)
#tspan = sample_times
# Solving the ODE
yt = odeint(test, a0, tspan)
# Plotting the ODE
plt.plot(tspan, yt, 'g')
plt.title('Multiple Parameters Test')
plt.xlabel('Time')
plt.ylabel('Magnitude')
plt.show()
Here is the plot created by the script:
Note that instead of generating the solution only at sample_times (i.e. at times 0, 1, 2, and 3), I set tspan to a denser set of points. This shows the behavior of the model between sample times.

How to create a container in pymc3

I am trying to build a model for the likelihood function of a particular outcome of a Langevin equation (Brownian particle in a harmonic potential):
Here is my model in pymc2 that seems to work:
https://github.com/hstrey/BayesianAnalysis/blob/master/Langevin%20simulation.ipynb
#define the model/function to be fitted.
def model(x):
t = pm.Uniform('t', 0.1, 20, value=2.0)
A = pm.Uniform('A', 0.1, 10, value=1.0)
#pm.deterministic(plot=False)
def S(t=t):
return 1-np.exp(-4*delta_t/t)
#pm.deterministic(plot=False)
def s(t=t):
return np.exp(-2*delta_t/t)
path = np.empty(N, dtype=object)
path[0]=pm.Normal('path_0',mu=0, tau=1/A, value=x[0], observed=True)
for i in range(1,N):
path[i] = pm.Normal('path_%i' % i,
mu=path[i-1]*s,
tau=1/A/S,
value=x[i],
observed=True)
return locals()
mcmc = pm.MCMC( model(x) )
mcmc.sample( 20000, 2000, 10 )
The basic idea is that each point depends on the previous point in the chain (Markov chain). Btw, x is an array of data, N is its length, delta_t is the time step =0.01. Any idea how to implement this in pymc3? I tried:
# define the model/function for diffusion in a harmonic potential
DHP_model = pm.Model()
with DHP_model:
t = pm.Uniform('t', 0.1, 20)
A = pm.Uniform('A', 0.1, 10)
S=1-pm.exp(-4*delta_t/t)
s=pm.exp(-2*delta_t/t)
path = np.empty(N, dtype=object)
path[0]=pm.Normal('path_0',mu=0, tau=1/A, observed=x[0])
for i in range(1,N):
path[i] = pm.Normal('path_%i' % i,
mu=path[i-1]*s,
tau=1/A/S,
observed=x[i])
Unfortunately the model crashes as soon as I try to run it. I tried some pymc3 examples (tutorial) on my machine and this is working.
Thanks in advance. I am really hoping that the new samplers in pymc3 will help me with this model. I am trying to apply Bayesian methods to single-molecule experiments.
Rather than creating many individual normally-distributed 1-D variables in a loop, you can make a custom distribution (by extending Continuous) that knows the formula for computing the log likelihood of your entire path. You can bootstrap this likelihood formula off of the Normal likelihood formula that pymc3 already knows. See the built-in AR1 class for an example.
Since your particle follows the Markov property, your likelihood looks like
import theano.tensor as T
def logp(path):
now = path[1:]
prev = path[:-1]
loglik_first = pm.Normal.dist(mu=0., tau=1./A).logp(path[0])
loglik_rest = T.sum(pm.Normal.dist(mu=prev*ss, tau=1./A/S).logp(now))
loglik_final = loglik_first + loglik_rest
return loglik_final
I'm guessing that you want to draw a value for ss at every time step, in which case you should make sure to specify ss = pm.exp(..., shape=len(x)-1), so that prev*ss in the block above gets interpreted as element-wise multiplication.
Then you can just specify your observations with
path = MyLangevin('path', ..., observed=x)
This should run much faster.
Since I did not see an answer to my question, let me answer it myself. I came up with the following solution:
# now lets model this data using pymc
# define the model/function for diffusion in a harmonic potential
DHP_model = pm.Model()
with DHP_model:
D = pm.Gamma('D',mu=mu_D,sd=sd_D)
A = pm.Gamma('A',mu=mu_A,sd=sd_A)
S=1.0-pm.exp(-2.0*delta_t*D/A)
ss=pm.exp(-delta_t*D/A)
path=pm.Normal('path_0',mu=0.0, tau=1/A, observed=x[0])
for i in range(1,N):
path = pm.Normal('path_%i' % i,
mu=path*ss,
tau=1.0/A/S,
observed=x[i])
start = pm.find_MAP()
print(start)
trace = pm.sample(100000, start=start)
unfortunately, this code takes at N=50 anywhere between 6hours to 2 days to compile. I am running on a pretty fast PC (24Gb RAM) running Ubuntu. I tried to using the GPU but that runs slightly slower. I suspect memory problems since it uses 99.8% of the memory when running. I tried the same calculation with Stan and it only takes 2min to run.

PCA for dimensionality reduction before Random Forest

I am working on binary class random forest with approximately 4500 variables. Many of these variables are highly correlated and some of them are just quantiles of an original variable. I am not quite sure if it would be wise to apply PCA for dimensionality reduction. Would this increase the model performance?
I would like to be able to know which variables are more significant to my model, but if I use PCA, I would be only able to tell what PCs are more important.
Many thanks in advance.
My experience is that PCA before RF is not an great advantage if any. Principal component regression(PCR) is e.g. when, PCA assists to regularize training features before OLS linear regression and that is very needed for sparse data-sets. As RF itself already performs a good/fair regularization without assuming linearity, it is not necessarily an advantage. That said, I found my self writing a PCA-RF wrapper for R two weeks ago. The code includes a simulated data set of a data set of 100 features comprising only 5 true linear components. Under such cercumstances it is infact a small advantage to pre-filter with PCA
The code is a seamless implementation, such that every RF parameters are simply passed on to RF. Loading vector are saved in model_fit to use during prediction.
#I would like to be able to know which variables are more significant to my model, but if I use PCA, I would be only able to tell what PCs are more important.
The easy way is to run without PCA and obtain variable importances and expect to find something similar for PCA-RF.
The tedious way, wrap the PCA-RF in a new bagging scheme with your own variable importance code. Could be done in 50-100 lines or so.
The souce-code suggestion for PCA-RF:
#wrap PCA around randomForest, forward any other arguments to randomForest
#define as new S3 model class
train_PCA_RF = function(x,y,ncomp=5,...) {
f.args=as.list(match.call()[-1])
pca_obj = princomp(x)
rf_obj = do.call(randomForest,c(alist(x=pca_obj$scores[,1:ncomp]),f.args[-1]))
out=mget(ls())
class(out) = "PCA_RF"
return(out)
}
#print method
print.PCA_RF = function(object) print(object$rf_obj)
#predict method
predict.PCA_RF = function(object,Xtest=NULL,...) {
print("predicting PCA_RF")
f.args=as.list(match.call()[-1])
if(is.null(f.args$Xtest)) stop("cannot predict without newdata parameter")
sXtest = predict(object$pca_obj,Xtest) #scale Xtest as Xtrain was scaled before
return(do.call(predict,c(alist(object = object$rf_obj, #class(x)="randomForest" invokes method predict.randomForest
newdata = sXtest), #newdata input, see help(predict.randomForest)
f.args[-1:-2]))) #any other parameters are passed to predict.randomForest
}
#testTrain predict #
make.component.data = function(
inter.component.variance = .9,
n.real.components = 5,
nVar.per.component = 20,
nObs=600,
noise.factor=.2,
hidden.function = function(x) apply(x,1,mean),
plot_PCA =T
){
Sigma=matrix(inter.component.variance,
ncol=nVar.per.component,
nrow=nVar.per.component)
diag(Sigma) = 1
x = do.call(cbind,replicate(n = n.real.components,
expr = {mvrnorm(n=nObs,
mu=rep(0,nVar.per.component),
Sigma=Sigma)},
simplify = FALSE)
)
if(plot_PCA) plot(prcomp(x,center=T,.scale=T))
y = hidden.function(x)
ynoised = y + rnorm(nObs,sd=sd(y)) * noise.factor
out = list(x=x,y=ynoised)
pars = ls()[!ls() %in% c("x","y","Sigma")]
attr(out,"pars") = mget(pars) #attach all pars as attributes
return(out)
}
A run code example:
#start script------------------------------
#source above from separate script
#test
library(MASS)
library(randomForest)
Data = make.component.data(nObs=600)#plots PC variance
train = list(x=Data$x[ 1:300,],y=Data$y[1:300])
test = list(x=Data$x[301:600,],y=Data$y[301:600])
rf = randomForest (train$x, train$y,ntree =50) #regular RF
rf2 = train_PCA_RF(train$x, train$y,ntree= 50,ncomp=12)
rf
rf2
pred_rf = predict(rf ,test$x)
pred_rf2 = predict(rf2,test$x)
cat("rf, R^2:",cor(test$y,pred_rf )^2,"PCA_RF, R^2", cor(test$y,pred_rf2)^2)
cor(test$y,predict(rf ,test$x))^2
cor(test$y,predict(rf2,test$x))^2
pairs(list(trueY = test$y,
native_rf = pred_rf,
PCA_RF = pred_rf2)
)
You can have a look here to get a better idea. The link says use PCA for smaller datasets!! Some of my colleagues have used Random Forests for the same purpose when working with Genomes. They had ~30000 variables and large amount of RAM.
Another thing I found is that Random Forests use up a lot of Memory and you have 4500 variables. So, may be you could apply PCA to the individual Trees.

Matching Numpy and NetCDF4 indexing when creating a netCDF file

I'm trying to move values from a numpy array to a NetCDF file, which I am creating. Currently I'm trying to find the best way to emulate 'fancy indexing' of numpy arrays when creating a netCDF file, but the two indexing systems don't match when the dataset only has two points.
import netCDF4
import numpy as np
rootgrp = netCDF4.Dataset('Test.nc','w',format='NETCDF4')
time = rootgrp.createDimension('time',None)
dim1 = rootgrp.createDimension('dim1',100)
dim2 = rootgrp.createDimension('dim2',100)
dim3 = rootgrp.createDimension('dim3',100)
ncVar = rootgrp.createVariable('ncVar','f4',('time','dim1','dim2','dim3'))
npArr = np.arange(0,10000)
npArr = np.reshape(npArr,(100,100))
So this works just fine:
x,y=np.array(([1,75,10,99],[40,88,19,2]))
ncVar[0,x,y,0] = npArr[x,y]
While this does not:
x,y=np.array(([1,75],[40,88]))
ncVar[0,x,y,0] = npArr[x,y]
These assignments are part of a dynamic loop that determines x,y to create values for ncVar at ~1000 time-steps
EDIT: the issue seems to be that the first case recognizes x,y as defining a series of pts, and so returns a [4,] size array (despite the documentation on netCDF4 'fancy indexing'), while the second interprets them combinatorially and so returns a [2,2] size array (as stated in the documentation). Has anyone run into this or found a workaround?