Recently I inherited quite a few old QuickBasic programs which perform various astronomical calculations. I'm attempting to understand these programs and rewrite some of them in Python. I do not have a deep background in astronomy.
A number of the programs take a parameter file as input, YEAR.DAT. Below are 5 years of these files (each column represents one file). I need help in figuring out the various data values.
YEAR.DAT
year 2001 2008 2009 2010 2011
delta t 66 65 66 66 67
tilt 23.43909 23.43818 23.43805 23.43799 23.43786
dow 1 2 4 5 6
gst 6.71430 6.66860 6.71839 6.702509 6.68659
x1 105.690 330.340 310.959 291.631 272.303
bs 84 90 88 87 86
fs 301 300 298 304 303
x2 357.765 356.959 357.689 357.433 357.177
x3 354.289 193.159 335.720 105.105 234.489
jd 2451910.5 2454466.5 2454832.5 2455197.5 2455562.5
I believe that all the values which are time dependent are for 0:00 hours on Jan. 1 of the year given.
Here are the values I think I've figured out:
tilt is the obliquity of the ecliptic
dow is the day of the week, where Monday is day 1
bs is the number of the day of the year when British Summer Time (BST) begins
fs is the number of the day of the year when BST ends
jd is the Julian day number (of 0:00 hours Jan. 1)
Values I'm unsure about:
delta t is some sort of time delta, but I don't know what
gst seems to be Greenwich Mean Sidereal Time, but for what moment?
x1, x2, and x3 I'm clueless about
Here are my questions:
What might delta t be?
Is gst in fact Greenwich Mean Sidereal Time? For what moment?
What are x1, x2, and x3? (This is a low-priority question.)
How can delta t, gst, and perhaps other values be determined for 2018, 2019, ...? (For jd and gst, see the sketch below.)
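For what it's worth, here is a rough Python sketch of how jd and gst could be reproduced, assuming gst really is Greenwich Mean Sidereal Time (in hours) at 0:00 UT on Jan. 1. The GMST formula is the standard low-precision approximation in terms of days since J2000.0, not anything taken from the QuickBasic programs:

from datetime import date

def julian_day_0h_jan1(year):
    # Julian day number at 0:00 UT on Jan. 1 of the given (Gregorian) year.
    # date.toordinal() counts days from 0001-01-01 (ordinal 1), which at 0h UT
    # is JD 1721425.5, hence the offset of 1721424.5.
    return date(year, 1, 1).toordinal() + 1721424.5

def gmst_hours_0h_jan1(year):
    # Greenwich Mean Sidereal Time (hours) at 0:00 UT on Jan. 1, using the
    # usual low-precision approximation in days since J2000.0 (JD 2451545.0):
    #   GMST = 18.697374558 + 24.06570982441908 * D   (mod 24)
    d = julian_day_0h_jan1(year) - 2451545.0
    return (18.697374558 + 24.06570982441908 * d) % 24.0

for y in (2001, 2008, 2009, 2010, 2011):
    print(y, julian_day_0h_jan1(y), round(gmst_hours_0h_jan1(y), 5))

# For 2001 this prints 2451910.5 and about 6.7143, matching the jd and gst
# columns above, which suggests gst is GMST in hours at 0:00 UT on Jan. 1.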
Any help will be greatly appreciated.
Roger House
I have gathered satellite data (every 5 minutes, from "Solcast") for GHI, DNI and DHI and I use pvlib to get the POA value.
The pvlib function I use:
import pandas as pd
from pvlib import irradiance

def get_irradiance(site_location, date, tilt, surface_azimuth, ghi, dni, dhi):
    # 288 five-minute timestamps, i.e. one full day
    times = pd.date_range(date, freq='5min', periods=12*24, tz=site_location.tz)
    solar_position = site_location.get_solarposition(times=times)
    POA_irradiance = irradiance.get_total_irradiance(
        surface_tilt=tilt,
        surface_azimuth=surface_azimuth,
        ghi=ghi,
        dni=dni,
        dhi=dhi,
        solar_zenith=solar_position['apparent_zenith'],
        solar_azimuth=solar_position['azimuth'])
    return pd.DataFrame({'GHI': ghi,
                         'DNI': dni,
                         'DHI': dhi,
                         'POA': POA_irradiance['poa_global']})
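A minimal usage sketch (the site coordinates and the constant irradiance series are made-up placeholders, not my real Solcast data) might look like this; an ISO date string such as '2022-06-13' keeps day and month unambiguous:

import pandas as pd
from pvlib import location

# Placeholder site and synthetic data, only to show how get_irradiance is called.
site = location.Location(52.0, 5.0, tz='Europe/Amsterdam')   # assumed NL location
day = '2022-06-13'                                           # ISO date string
times = pd.date_range(day, freq='5min', periods=12*24, tz=site.tz)
ghi = pd.Series(400.0, index=times)   # constant values, purely illustrative
dni = pd.Series(500.0, index=times)
dhi = pd.Series(150.0, index=times)

df = get_irradiance(site, day, tilt=12.5, surface_azimuth=180,
                    ghi=ghi, dni=dni, dhi=dhi)
print(df.resample('1h').mean())       # hourly means, like the tables below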
When I compare GHI and POA values for 12 June 2022 and 13 June 2022, I see that the POA values for 12 June are significantly lower than the GHI. The location is in The Netherlands; I use a tilt of 12.5 degrees and an azimuth of 180 degrees. Here is the outcome (per hour, from 6:00 - 20:00):
12 June 2022
Hour GHI DNI DHI POA
6 86.750000 312.750000 40.500000 40.277034
7 224.583333 543.000000 69.750000 71.130218
8 366.833333 598.833333 113.833333 178.974322
9 406.083333 182.000000 304.000000 348.272844
10 532.166667 266.750000 346.666667 445.422584
11 725.666667 640.416667 226.500000 509.360716
12 688.500000 329.416667 409.583333 561.630762
13 701.333333 299.750000 439.333333 570.415438
14 725.416667 391.666667 387.750000 532.529676
15 753.916667 629.166667 244.333333 407.665794
16 656.750000 599.750000 215.333333 293.832376
17 381.833333 36.416667 359.416667 356.317883
18 411.750000 569.166667 144.750000 144.254438
19 269.750000 495.916667 102.500000 102.084439
20 134.583333 426.416667 51.583333 51.370738
And
13 June 2022
Hour GHI DNI DHI POA
6 5.666667 0.000000 5.666667 5.616296
7 113.500000 7.750000 111.416667 111.948831
8 259.500000 106.833333 208.416667 256.410392
9 509.166667 637.750000 150.583333 514.516389
10 599.333333 518.666667 240.583333 619.050821
11 745.250000 704.500000 195.583333 788.773772
12 757.250000 549.666667 292.000000 798.739403
13 742.000000 464.583333 335.000000 778.857394
14 818.250000 667.750000 243.000000 869.972769
15 800.750000 776.833333 166.916667 852.559043
16 699.000000 733.666667 167.166667 730.484502
17 582.666667 729.166667 131.916667 593.802853
18 449.166667 756.583333 83.500000 434.958210
19 290.083333 652.666667 68.666667 254.048655
20 139.833333 466.916667 48.333333 97.272684
What could explain the significantly lower POA compared to the GHI values on 12 June?
I see this with other days too: some days have a POA much closer to the GHI than others. Maybe this is "normal behaviour" and I am not reckoning with weather influences which may be important...
I use the POA to do a PR (Performance Ratio) calculation, but I do not get "trusted" results.
Hope someone can shine a light on these values.
Kind regards,
Oscar
The Netherlands.
I'm really sorry: although the weather is unpredictable in the Netherlands, I made a very big booboo in using dd-mm-yyyy format instead of mm-dd-yyyy. Something I overlooked for a long time... (I had never used mm-dd-yyyy before, but that's a lame excuse...)
Really sorry, I hope you did not think about it for too long.
Thank you anyway for reacting!
I've good values now!
Oscar (shame..)
I am working with three-dimensional macroeconomic panel data in Stata. My data is compiled from 51 issues of the Economic Outlook (EO) from the OECD, each containing data for up to 30 countries from 1960 up to 2010, where the first issue is from 1985 and the last issue is from 2010. The issues are released semiannually and each issue has historic data as well as forecasts 2 periods ahead. So for each variable there are essentially three subscripts: country (i), the time the data concerns (t), and the time the data was released (r).
I want to identify a fiscal policy shock as a forecast error: the forecast of public spending minus the realized value from the EO issue one period later. So, for the forecasted value, t=r-1, while for the realized value, t=r. For public spending, g, the forecast error should look like:
g_i,t,r(t=r-1) - g_i,t,r(t=r)
(if that makes sense).
I have never worked with three-dimensional panel data, so I don't know how to code with it. Currently my data looks like this:
time_str value frequency location variable year eo year_half eo_year var_cat eo_half time_cal time_eo tt_cal tt_eo id_cal id_eo time_actual
1970_1 16214 S CAN cg 1970 38 1 1985 Govt final cons expen, val, GDP exp approach 2 1970 1985.5 21 1 1 504 1970h1
1970_2 17046 S CAN cg 1970 38 2 1985 Govt final cons expen, val, GDP exp approach 2 1970.5 1985.5 22 1 1 530 1970h2
1971_1 17768 S CAN cg 1971 38 1 1985 Govt final cons expen, val, GDP exp approach 2 1971 1985.5 23 1 1 556 1971h1
1971_2 18968 S CAN cg 1971 38 2 1985 Govt final cons expen, val, GDP exp approach 2 1971.5 1985.5 24 1 1 582 1971h2
1972_1 19442 S CAN cg 1972 38 1 1985 Govt final cons expen, val, GDP exp approach 2 1972 1985.5 25 1 1 608 1972h1
1972_2 21140 S CAN cg 1972 38 2 1985 Govt final cons expen, val, GDP exp approach 2 1972.5 1985.5 26 1 1 634 1972h2
1973_1 22274 S CAN cg 1973 38 1 1985 Govt final cons expen, val, GDP exp approach 2 1973 1985.5 27 1 1 660 1973h1
1973_2 23800 S CAN cg 1973 38 2 1985 Govt final cons expen, val, GDP exp approach 2 1973.5 1985.5 28 1 1 686 1973h2
Some explanation of the data:
tt_eo = id for the EO issue. In the example shown, all the data is from the first issue released in 1985
tt_cal = id for the actual time (the period the data concerns)
id_eo = id for each country-variable within each actual period (the time of release varies)
id_cal = id for each country-variable within each EO issue (the actual time the data concerns varies)
time_eo = time of release
time_cal = the actual time the data concerns
My economic variables are not stored as separate Stata variables but rather as values of the variable "variable". Therefore I cannot generate anything from them or refer to them directly, as Stata doesn't recognize them as variables.
I have tried setting the data (see code below) but I still don't know how to work with the data.
*converting to time data and setting the time
gen time_actual = yh(year, year_half)
xtset id_cal time_actual, format(%th)
Does anyone have any suggestions on how to generate my forecast error variables (or generally how to work with this type of data)?
I need to make a report on the relationship between sick leave (days) and man-years. The data is at monthly level, covers four years, and looks like this (there are also separate columns for year and business unit):
Month Sick leaves (days) Man-years
January 35 1.5
February 0 1.63
March 87 1.63
April 60 2.4
May 44 2.6
June 0 1.8
July 0 1.4
August 51 1.7
September 22 1.6
October 64 1.9
November 70 2.2
December 55 2
It has to be possible for the user to filter by year, month, and business unit and get the number of sick leave days during the filtered time period (and in the selected business unit) compared to the total sum of man-years in the same period (and unit). Calculated from the test data above, the desired result should be 488/22.36 = 21.82.
However, I have not managed to do what I want. The main problem is that the calculation takes into account only those months with nonzero sick leave days and ignores the man-years of the months with zero sick leave days (in the example data: February, June and July). I have tried several alternative functions (ALL, ALLSELECTED, FILTER...), but the results remain poor. Any information about a better solution will be highly appreciated.
It sounds like this has to do with the way DAX handles blanks (https://www.sqlbi.com/articles/blank-handling-in-dax/). Your context is probably filtering out the rows with blank values for "Sick-days". How to resolve this depends on how your data are structured, but you could try using variables to change your filter context or use "IF ( ISBLANK ( ... ) )" to make sure you're counting the blank rows.
I'm writing my master's thesis on the costs of occupational injuries. As part of the thesis I have estimated the expected wage loss for each person for every year for four years after the injury. I would like to discount the estimated losses to a specific base year (2009) in SAS.
For the year 2009 the discounted loss just equals the estimated loss. From 2010 onward the discounted loss can be calculated with the NETPV function:
IF year=2009 then discount_loss=wage;
IF year=2010 then discount_loss=netpv(0.1,1,0,wage);
IF year=2011 then discount_loss=netpv(0.1,1,0,0,wage);
And so forth. But starting from 2014 I would like to use the estimated wage loss for 2014 as the expected loss from then on - so, for instance, if the estimated loss is $100, that would represent the yearly loss until retirement. Since people are not all the same age, there are too many cases to just hard-code, so I'm looking for a better way. There are approximately 200,000 persons in my data set, with different estimated losses for each year.
The format of the (fictional) data looks like this:
id age year age_retirement wage_loss rate discount_loss
1 35 2009 65 -100 0.1 -100
1 36 2010 65 -100 0.1 -90.91
1 37 2011 65 -100 0.1 -82.64
1 38 2012 65 -100 0.1 -75.13
1 39 2013 65 -100 0.1 -68.30
1 40 2014 65 -100 0.1
The column discount_loss is the net present value of the loss in 2009, calculated as above.
I would like the loss in 2014 to represent the sum of the losses for the rest of the period on the labor market (until age_retirement). That would be -$100 per year, discounted to 2009, for each year from 2014 until 2014+(65-40).
Thanks!
Use the FINANCE function for PV, Present Value.
In your situation above, you're looking for the value of 100 for 25 years of payments (65-40)=25. I'll leave the calculation of the number of years up to you.
FINANCE('PV', rate, nper, payment, <fv>, <type>);
In your case, Future Value is 0 and the type=1 as you assume payment at the beginning of the year.
The formula below calculates the present value of a series of 100 payments over 25 years with a 10% interest rate and paid at the beginning of the period.
value=FINANCE('PV', 0.1, 25, -100, 0, 1);
Value = 998.47440201
Reference is here:
https://support.sas.com/documentation/cdl/en/lefunctionsref/67960/HTML/default/viewer.htm#p1cnn1jwdmhce0n1obxmu4iq26ge.htm
If you are looking for speed, why not first calculate an array that contains the PV of $1 for i years, where i goes from 1 to n. Then just select the element you need and multiply. This could all be done in a data step.
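To make the idea concrete, here is a small sketch of precomputing the PV-of-$1 factors and looking one up. It is written in Python purely for illustration (the suggestion above is to do the same in a SAS data step), and the annuity-due formula is the same quantity the FINANCE('PV', ...) call returns:

rate = 0.1
max_years = 50                     # assumed upper bound on years to retirement

# pv1[i] = present value of $1 per year for i years, paid at the start of each
# year (an annuity-due), i.e. the same quantity as FINANCE('PV', rate, i, -1, 0, 1)
pv1 = [0.0] * (max_years + 1)
for i in range(1, max_years + 1):
    pv1[i] = (1 - (1 + rate) ** -i) / rate * (1 + rate)

years_left = 65 - 40               # the example person: 25 years to retirement
print(round(100 * pv1[years_left], 8))   # about 998.4744, matching the FINANCE example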
I have a transaction level dataset and I want to collapse and calculate weekly average price. The dataset can be simplified as follows,
clear
input str9 date quantity price id
"01jan2010" 50 70 1
"02jan2010" 60 80 2
"02jan2010" 70 90 3
"04jan2010" 70 95 4
"08jan2010" 60 81 5
"09jan2010" 70 88 6
"12jan2010" 55 87 7
"13jan2010" 52 88 8
end
gen date2=date(date,"DMY")
format date2 %td
drop date
I want to create a variable date3. For every transaction that happened in a given week, date3 should be the Monday of that week.
Here's the code I have:
sort date2
gen date3=date2 if dow(date2)==1
replace date3=date3[_n-1] if missing(date3)
format date3 %td
However, there are Mondays with no transactions while the rest of the week has transactions. In those cases, date3 is not the Monday of that week, but the Monday of an earlier week.
My data becomes the following using the above code:
quantity price id date2 date3
50 70 1 01jan2010
60 80 2 02jan2010
70 90 3 02jan2010
70 95 4 04jan2010 04jan2010
60 81 5 08jan2010 04jan2010
70 88 6 09jan2010 04jan2010
55 87 7 12jan2010 04jan2010
52 88 8 13jan2010 04jan2010
To me, it does not matter that id = 1, 2, 3 have no date3. What concerns me is that id = 7 and id = 8 should have a date3 of 11jan2010. But because there is no transaction on that day, the date becomes 04jan2010. Is there a way to fix this?
(I was thinking of constructing a new dataset with consecutive dates since 01jan2010, merging it with the one above, and then dropping observations with missing quantity or price. But I was wondering if there's a more efficient way.)
In addition, I have weekly index data that reports on every Friday since 01jan2010. If I use the wofd command, Stata will generate 53 weeks in 2010. (Or more precisely, two 2010w52.) How can I get just 52 weeks in Stata?
(I found this http://www.stata.com/statalist/archive/2012-02/msg01030.html but I still cannot figure out how this can help solve my problem. )
Your weeks start on Mondays. Everything you need follows from using dow() to exploit the fact that in every one of your weeks, the day of week function dow() yields 1, 2, 3, 4, 5, 6, 0 for the days from Monday to Sunday.
The present or previous Monday for a daily date variable daily is just
gen Monday = cond(dow(daily) == 0, daily - 6, daily - dow(daily) + 1)
The branch is like this. If it's a Sunday, the previous Monday was 6 days ago. Otherwise, the Monday that starts the week was today if it's Monday and dow() yields 1, yesterday if it's Tuesday and 2, and so forth. Here the variable Monday is just the dates of Mondays that define the weeks.
Important detail: There are no assumptions here about dates being complete in the data or even in order.
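(Purely as an aside, for anyone reproducing this outside Stata, here is the same Monday-of-week idea as a short Python sketch using the dates from the question; it is just an illustration, not part of the Stata solution.)

from datetime import date, timedelta

def monday_of_week(d):
    # Python's weekday() yields Monday=0 ... Sunday=6, so stepping back
    # weekday() days lands on the Monday that starts the week, the same
    # result as the Stata cond(...) expression above.
    return d - timedelta(days=d.weekday())

for d in (date(2010, 1, 4), date(2010, 1, 8), date(2010, 1, 12), date(2010, 1, 13)):
    print(d, "->", monday_of_week(d))
# 12jan2010 and 13jan2010 both map to 11jan2010, as wanted in the question.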
Small note: Arbitrary names like date2 and date3 mean nothing much. Use evocative names in your questions (and your practice).
There was a sequel to the article mentioned by Robert Ferrer. Type search week, sj in Stata to get the references.
Do not use Stata's weeks and in particular do not use the wofd() function (not a command), as they can't help you. Stata's weeks will not map on to your weeks. The article mentioned by Robert Ferrer really is worthwhile reading to understand this (even though I wrote it).
(This is all explained in the Statalist threads you link to.)