ModelChain example with PVLIB - don't trust the 1-axis tracking AC output - pvlib
I'm trying to use PVLIB to estimate output power for a PV System installed in the west of my country.
As an example I've got 2 days of hourly GHI, 2m Temperature and 10m wind speed from MERRA2 reanalysis.
I want to estimate how much power a fixed PV system or a 1-axis tracking system would generate from the aforementioned dataset, using the ModelChain function from PVLIB. I first estimate DNI from the GHI data using the DISC model, and then compute DHI as the difference between GHI and DNI*cos(Z).
a) The first behaviour I am not completely sure about: here is the plot of GHI, DNI, DHI, T2m and wind speed. DNI seems to be shifted, with its maximum occurring 1 hour before the GHI maximum.
Weather Figure
After preparing irradiance data I calculated AC using Model Chain, specifying the fixed PV System and 1 axis single tracking system.
The thing is that I don't trust the AC output for the 1-axis tracking system. I expected a plateau-shaped AC output, but instead found rather odd behaviour.
Here are the output values of power generation I expected to see:
Expectation
And here is the estimated output by PVLIB
Reality
I hope someone can help me find the error in my procedure.
Here is the code:
# =============================================================================
# Example of using MERRA2 data and PVLIB
# =============================================================================
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pvlib
from pvlib.pvsystem import PVSystem
from pvlib.location import Location
from pvlib.modelchain import ModelChain
# =============================================================================
# 1) Create small data set extracted from MERRA
# =============================================================================
GHI = np.array([0,0,0,0,0,0,0,0,0,10.8,148.8,361,583,791.5,998.5,1105.5,1146.5,1118.5,1023.5,
860.2,650.2,377.1,165.1,16,0,0,0,0,0,0,0,0,0,11.3,166.2,395.8,624.5,827,986,
1065.5,1079,1025.5,941.5,777,581.5,378.9,156.2,20.6,0,0,0,0])
temp_air = np.array([21.5,20.5,19.7,19.6,18.8,17.9,17.1,16.5,16.2,16.2,17,21.3,24.7,26.9,28.8,30.5,
31.6,32.4,33,33.3,32.9,32,30.6,28.7,25.4,23.9,22.6,21.2,20.3,19.9,19.5,19.1,18.4,
17.7,18.3,23,25.1,27.3,29.5,31.2,32.1,32.6,32.6,32.5,31.8,30.7,29.6,28.1,24.6,22.9,
22.3,23.2])
wind_speed = np.array([3.1,2.7,2.5,2.6,2.8,3,3,3,2.8,2.5,2.1,1,2.2,3.7,4.8,5.6,6.1,6.4,6.5,6.6,6.3,5.8,5.3,
3.7,3.9,4,3.6,3.4,3.4,3,2.6,2.3,2.1,2,2.2,2.7,3.2,4.3,5.1,5.6,5.7,5.8,5.8,5.7,5.4,4.8,
4.4,3.1,2.7,2.3,1.1,0.6])
# pd.DatetimeIndex(start=..., end=...) has been removed; pd.date_range is the current API
local_timestamp = pd.date_range(start='1979-12-31 21:00', end='1980-01-03 00:00', freq='1h', tz='America/Argentina/Buenos_Aires')  # 52 hourly stamps, matching the arrays above
d = {'ghi':GHI,'temp_air':temp_air,'wind_speed':wind_speed}
data = pd.DataFrame(data=d)
data.index = local_timestamp
lat = -31.983
lon = -68.530
location = Location(latitude = lat,
longitude = lon,
tz = 'America/Argentina/Buenos_Aires',
altitude = 601)
# =============================================================================
# 2) SOLAR POSITION AND ATMOSPHERIC MODELING
# =============================================================================
solpos = pvlib.solarposition.get_solarposition(time = local_timestamp,
latitude = lat,
longitude = lon,
altitude = 601)
# DNI and DHI calculation from GHI data
DNI = pvlib.irradiance.disc(ghi = data.ghi,
solar_zenith = solpos.zenith,
datetime_or_doy = local_timestamp)
DHI = data.ghi - DNI.dni*np.cos(np.radians(solpos.zenith.values))
d = {'ghi': data.ghi,'dni': DNI.dni,'dhi': DHI,'temp_air':data.temp_air,'wind_speed':data.wind_speed }
weather = pd.DataFrame(data=d)
weather.plot()  # quick sanity check of ghi, dni, dhi, temp_air and wind_speed
# =============================================================================
# 3) SYSTEM SPECIFICATIONS
# =============================================================================
# load some module and inverter specifications
sandia_modules = pvlib.pvsystem.retrieve_sam('SandiaMod')
cec_inverters = pvlib.pvsystem.retrieve_sam('cecinverter')
sandia_module = sandia_modules['Canadian_Solar_CS5P_220M___2009_']
cec_inverter = cec_inverters['Power_Electronics__FS2400CU15__645V__645V__CEC_2018_']
# Fixed system with tilt=abs(lat)-10
f_system = PVSystem( surface_tilt = abs(lat)-10,
surface_azimuth = 0,
module = sandia_module,
inverter = cec_inverter,
module_parameters = sandia_module,
inverter_parameters = cec_inverter,
albedo = 0.20,
modules_per_string = 100,
strings_per_inverter = 100)
# 1 axis tracking system
t_system = pvlib.tracking.SingleAxisTracker(axis_tilt = 0, #abs(-33.5)-10
axis_azimuth = 0,
max_angle = 52,
backtrack = True,
module = sandia_module,
inverter = cec_inverter,
module_parameters = sandia_module,
inverter_parameters = cec_inverter,
name = 'tracking',
gcr = .3,
modules_per_string = 100,
strings_per_inverter = 100)
# =============================================================================
# 4) MODEL CHAIN USING ALL THE SPECIFICATIONS for a fixed and 1 axis tracking systems
# =============================================================================
mc_f = ModelChain(f_system, location)
mc_t = ModelChain(t_system, location)
# Next, we run a model with some simple weather data.
mc_f.run_model(times=weather.index, weather=weather)
mc_t.run_model(times=weather.index, weather=weather)
# =============================================================================
# 5) Get only AC output form a fixed and 1 axis tracking systems and assign
# 0 values to each NaN
# =============================================================================
d = {'fixed': mc_f.ac, 'tracking': mc_t.ac}
AC = pd.DataFrame(data=d).fillna(0)  # night-time NaNs -> 0 W; avoids chained-assignment warnings
AC.plot()
I hope someone can help me with the interpretation of the results and the debugging of the code.
Thanks a lot!
I suspect your issue is due to the way the hourly data is treated. Be sure that you're consistent with the interval labeling (beginning/end) and treatment of instantaneous vs. average data. One likely cause is using hourly average GHI data to derive DNI data. pvlib.solarposition.get_solarposition returns the solar position at the instants in time that are passed to it. So you're mixing up hourly average GHI values with instantaneous solar position values when you use pvlib.irradiance.disc to calculate DNI and when you calculate DHI. Shifting your time index by 30 minutes will reduce, but not eliminate, the error. Another approach is to resample the input data to be of 1-5 minute resolution.