I am trying to scrape time series data from the web page (http://owww.met.hu/eghajlat/eghajlati_adatsorok/bp/Navig/202_EN.htm) into a pandas DataFrame using Python 2.7. Could somebody please help me write the code? Thanks!
I tried the following code:
html =urllib.urlopen("http://owww.met.hu/eghajlat/eghajlati_adatsorok/bp/Navig/202_EN.htm");
text= html.read();
df=pd.DataFrame(index=datum, columns=['m_ta','m_tax','m_taxd', 'm_tan','m_tand'])
But it doesn't produce anything. I want to display the table as it appears on the page.
You can use BeautifulSoup to parse all font tags, then split column a into separate columns, call set_index on column idx, and call rename_axis(None) to remove the index name:
import pandas as pd
import urllib
from bs4 import BeautifulSoup
html = urllib.urlopen("http://owww.met.hu/eghajlat/eghajlati_adatsorok/bp/Navig/202_EN.htm");
soup = BeautifulSoup(html)
#print soup
fontTags = soup.findAll('font')
#print fontTags
#get text from the font tags
li = [x.text for x in soup.findAll('font')]
#remove the first 13 tags, which do not contain the needed data
df = pd.DataFrame(li[13:], columns=['a'])
#split data on arbitrary whitespace
df = df.a.str.split(r'\s+', expand=True)
#set column names
df.columns = ['idx','m_ta','m_tax','m_taxd', 'm_tan','m_tand']
#convert column idx to period
df['idx'] = pd.to_datetime(df['idx']).dt.to_period('M')
#convert columns to datetime
df['m_taxd'] = pd.to_datetime(df['m_taxd'])
df['m_tand'] = pd.to_datetime(df['m_tand'])
#set column idx to index, remove index name
df = df.set_index('idx').rename_axis(None)
print df
m_ta m_tax m_taxd m_tan m_tand
1901-01 -4.7 5.0 1901-01-23 -12.2 1901-01-10
1901-02 -2.1 3.5 1901-02-06 -7.9 1901-02-15
1901-03 5.8 13.5 1901-03-20 0.6 1901-03-01
1901-04 11.6 18.2 1901-04-10 7.4 1901-04-23
1901-05 16.8 22.5 1901-05-31 12.2 1901-05-05
1901-06 21.0 24.8 1901-06-03 14.6 1901-06-17
1901-07 22.4 27.4 1901-07-30 16.9 1901-07-04
1901-08 20.7 25.9 1901-08-01 14.7 1901-08-29
1901-09 15.9 19.9 1901-09-01 11.8 1901-09-09
1901-10 12.6 17.9 1901-10-04 8.3 1901-10-31
1901-11 4.7 11.1 1901-11-14 -0.2 1901-11-26
1901-12 4.2 8.4 1901-12-22 -1.4 1901-12-07
1902-01 3.4 7.5 1902-01-25 -2.2 1902-01-15
1902-02 2.8 6.6 1902-02-09 -2.8 1902-02-06
1902-03 5.3 13.3 1902-03-22 -3.5 1902-03-13
1902-04 10.5 15.8 1902-04-21 6.1 1902-04-08
1902-05 12.5 20.6 1902-05-31 8.5 1902-05-10
1902-06 18.5 23.8 1902-06-30 14.4 1902-06-19
1902-07 20.2 25.2 1902-07-01 15.5 1902-07-03
1902-08 21.1 25.4 1902-08-07 14.7 1902-08-13
1902-09 16.1 23.8 1902-09-05 9.5 1902-09-24
1902-10 10.8 15.4 1902-10-12 4.9 1902-10-25
1902-11 2.4 9.1 1902-11-01 -4.2 1902-11-18
1902-12 -3.1 7.2 1902-12-27 -17.6 1902-12-15
1903-01 -0.5 8.3 1903-01-11 -11.5 1903-01-23
1903-02 4.6 13.4 1903-02-23 -2.7 1903-02-17
1903-03 9.0 16.1 1903-03-28 4.9 1903-03-09
1903-04 9.0 16.5 1903-04-29 2.6 1903-04-19
1903-05 16.4 21.2 1903-05-03 11.3 1903-05-19
1903-06 19.0 23.1 1903-06-03 15.6 1903-06-07
... ... ... ... ... ...
1998-07 22.5 30.7 1998-07-23 15.0 1998-07-09
1998-08 22.3 30.5 1998-08-03 14.8 1998-08-29
1998-09 16.0 21.0 1998-09-12 10.4 1998-09-14
1998-10 11.9 17.2 1998-10-07 8.2 1998-10-27
1998-11 3.8 8.4 1998-11-05 -1.6 1998-11-21
1998-12 -1.6 6.2 1998-12-14 -8.2 1998-12-26
1999-01 0.6 4.7 1999-01-15 -4.8 1999-01-31
1999-02 1.5 6.9 1999-02-05 -4.8 1999-02-01
1999-03 8.2 15.5 1999-03-31 3.0 1999-03-16
1999-04 13.1 17.1 1999-04-16 6.1 1999-04-18
1999-05 17.2 25.2 1999-05-31 11.1 1999-05-06
1999-06 19.8 24.4 1999-06-07 12.2 1999-06-22
1999-07 22.3 28.0 1999-07-06 16.3 1999-07-23
1999-08 20.6 26.7 1999-08-09 17.3 1999-08-23
1999-09 19.3 22.9 1999-09-26 15.0 1999-09-02
1999-10 11.5 19.0 1999-10-03 5.7 1999-10-18
1999-11 3.9 12.6 1999-11-04 -2.2 1999-11-21
1999-12 1.3 6.4 1999-12-13 -8.1 1999-12-25
2000-01 -0.7 8.7 2000-01-31 -6.6 2000-01-25
2000-02 4.5 10.2 2000-02-01 -0.1 2000-02-23
2000-03 6.7 11.6 2000-03-09 0.6 2000-03-17
2000-04 14.8 22.1 2000-04-21 5.8 2000-04-09
2000-05 18.7 23.9 2000-05-27 12.3 2000-05-22
2000-06 21.9 29.3 2000-06-14 15.4 2000-06-17
2000-07 20.3 26.6 2000-07-03 14.0 2000-07-16
2000-08 23.8 29.7 2000-08-20 18.5 2000-08-31
2000-09 16.1 21.5 2000-09-14 12.7 2000-09-24
2000-10 14.1 18.7 2000-10-04 8.0 2000-10-23
2000-11 9.0 14.9 2000-11-15 3.7 2000-11-30
2000-12 3.0 9.4 2000-12-14 -6.8 2000-12-24
[1200 rows x 5 columns]
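The split-and-convert steps above can be tried without network access. The following is a minimal sketch, assuming Python 3 and a reasonably recent pandas. `sample_rows` is a hypothetical stand-in for the `li[13:]` list and holds three rows copied from the output above. It also converts the temperature columns to floats, since the split itself only yields strings.

```python
import pandas as pd

# Hypothetical stand-in for li[13:]: three rows copied from the printed table.
sample_rows = [
    "1901-01 -4.7 5.0 1901-01-23 -12.2 1901-01-10",
    "1901-02 -2.1 3.5 1901-02-06 -7.9 1901-02-15",
    "1901-03 5.8 13.5 1901-03-20 0.6 1901-03-01",
]

# Split each row on runs of whitespace into one column per field.
df = pd.DataFrame(sample_rows, columns=['a'])
df = df['a'].str.split(expand=True)
df.columns = ['idx', 'm_ta', 'm_tax', 'm_taxd', 'm_tan', 'm_tand']

# The split produces strings, so convert the temperature columns to floats.
for col in ['m_ta', 'm_tax', 'm_tan']:
    df[col] = df[col].astype(float)

# Month labels become monthly periods; the two date columns become datetimes.
df['idx'] = pd.to_datetime(df['idx'], format='%Y-%m').dt.to_period('M')
df['m_taxd'] = pd.to_datetime(df['m_taxd'], format='%Y-%m-%d')
df['m_tand'] = pd.to_datetime(df['m_tand'], format='%Y-%m-%d')

# Use the period column as the index and drop the index name.
df = df.set_index('idx').rename_axis(None)
print(df)
```

With float columns in place, later steps such as means or rolling windows work directly on the DataFrame.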
Related
I have two data sets having the same content but one is in tab-delimited format, and the other is in space-delimited format.
Space-Delimited
Tab_Delimited
I have three questions that I could not figure out and would like to ask for help with. Any suggestions would be highly appreciated.
First, I opened both data sets in TextWrangler. My impression is that in the space-delimited data set the values are separated by spaces and the values in each row line up in the same positions.
On the other hand, my understanding is that in the tab-delimited data set the values are separated by blanks whose widths are not necessarily the same from row to row. Is my understanding correct? I am having trouble telling the two apart.
Second, I was printing the snowfall data set mentioned above from row 5 to row 122, and the "T" values in the data set have to be converted to 0.
My code for the space-delimited file of the snowfall data was as below,
and my question was about its LOG. There were many warnings about "T" but I did not receive any errors.
LOG
Should I be concerned about the warnings here mentioning
"invalid data for month(i) in line..."
* Trying Space-Delimited data set;
OPTIONS Errors=200;
DATA SASWEEK.SnowSpace;
DROP i MyTot diff;
INFILE "&dirLSB.RochesterSnowfallSpace.txt" FIRSTOBS= 2 OBS= 122;
INPUT Season $ Sep Oct Nov Dec Jan Feb Mar Apr May Total ;
ARRAY Month(10) Sep -- Total;
DO i = 1 TO 10 ;
IF Month(i) = . THEN Month(i) = 0 ;
MyTot = sum (of Sep -- May);
diff = round (MyTot-Total, 3);
IF diff ne 0 THEN PUT "**ERROR" MyTot= Total= diff= ;
END;
PROC PRINT DATA=sasweek.snowspace;
TITLE "Rochester Snowfall in Space-Delimited format";
RUN;
One of my professors suggested that I should have read the monthly snowfall values as character variables, so that the "T"s would not cause warnings in the LOG. I am not sure whether I should try it this way.
Lastly, I tried to use PROC IMPORT for the same data set stored as an xls file.
The data set is as the link
And my code is as follows:
* Trying Excel file ;
OPTIONS ERRORS=200;
OPTIONS MSGLEVEL=i;
PROC IMPORT OUT=SASWEEK.SNOWxls
DATAFILE= "&dirLSB.RochesterSnowfall.xls" DBMS=xls;
GETNAMES= no;
RANGE= "Sheet1$a5:k122" ;
PROC PRINT DATA= SASWEEK.SNOWxls;
TITLE "Rochester Snowfall in xls format";
RUN;
I received the error in the LOG saved as the HTML
I still printed out a part of the dataset but the variable names were messed up and the output was not complete.
Any ideas?
Thank you all for your reading and thanks for any help:)
The DATA step with an INPUT statement is probably the best place to start.
Warnings are fine, unless the goal is a log with no warnings.
The data file can be read cleanly by creating an input environment built for it:
A custom informat, zeroT, converts T (text) to 0 (number). This prevents the warnings.
INFILE
DLM='0920'x specifies that either a tab ('09'x) or a space ('20'x) may delimit the values in the data file.
INPUT
Wrap the fields Sep to Total in parentheses ( ) to indicate grouped input.
Wrap the informat specifiers that are applied over the grouped variables in parentheses ( ) as well.
The : list input modifier advances input parsing to the next non-blank character and reads until the next blank.
Sample Code
proc format;
invalue zeroT 'T'=0 other=[best12.];
run;
data have;
infile snowdata firstobs=2 dlm='0920'x;
INPUT Season $ (Sep Oct Nov Dec Jan Feb Mar Apr May Total) (10 * :zeroT.) ;
run;
Sample Data (from SP text viewer)
filename snowdata "%TEMP%\roc_snowfalls.txt";
* create local sample data file, text copied from sharepoint viewer;
data _null_;
file snowdata;
input;
put _infile_;
datalines;
Season Sep Oct Nov Dec Jan Feb Mar Apr May Total
1884-85 0 T 1 27.1 22.2 17 3.5 19.5 T 90.3
1885-86 0 1.7 8.2 8.4 16.9 16 6.5 7 0 64.7
1886-87 0 T 22.2 12.5 12 18.4 6.3 1.2 0 72.6
1887-88 0 0.2 2.2 9.3 21.3 4.1 13.2 0.4 0 50.7
1888-89 0 T 4 15.5 17.8 22 17.5 5.4 0 82.2
1889-90 0 T 5.7 6.1 20.2 14.8 19 T 0 65.8
1890-91 0 0 2.1 29.2 16.1 24.6 12.2 0.3 0.1 84.6
1891-92 0 0.1 9.7 4.7 26.4 10.3 25.1 0.8 T 77.1
1892-93 0 T 14 19.2 15.9 29.8 8.1 9.6 0 96.6
1893-94 0 0.5 6.1 27.6 20 29.5 5.4 13.3 0 102.4
1894-95 0 T 11.1 22.1 26.5 23.6 9.5 0.6 0 93.4
1895-96 0 1.5 5.9 8.7 22.5 39.1 45.1 1 0 123.8
1896-97 0 T 5.5 13.9 20.1 13.7 8.1 5.2 0 66.5
1897-98 0 0 10.1 18.4 32.1 26.8 1.2 2.4 0 91
1898-99 0 T 10.6 27 16.6 16.3 21.2 4.3 T 96
1899-00 T T 1.3 21.5 24.7 28.5 54 1.3 0 131.3
1900-01 0 0 17 20.3 29.8 36.9 13.7 23.8 T 141.5
1901-02 0 0.1 14.1 14.5 23.8 23 1.2 2.3 T 79
1902-03 0 0.1 4.1 27.7 18.1 15.6 2.4 0.3 0 68.3
1903-04 0 0.6 4.4 16.1 27.2 17.2 10.7 19.5 T 95.7
1904-05 0 0.2 2.1 15.8 27.5 15.2 7 0.5 0 68.3
1905-06 0 T 4 8.4 7.6 8 15.2 1.1 0 44.3
1906-07 0 5 5.7 18.7 11.7 15.7 3.1 2.5 1.3 63.7
1907-08 0 0 2.2 11.6 16.5 19.8 7.9 6.3 3 67.3
1908-09 0 0.5 4.6 10 22.5 6.1 9.7 9.8 3.3 66.5
1909-10 0 T 1.7 14.6 22 42.7 3.4 0.5 0 84.9
1910-11 0 2.2 15.7 29.8 9.5 30 13.5 4.7 2 107.4
1911-12 0 0 6.5 7.5 21.5 10.8 8.8 6.9 T 62
1912-13 0 0 7.2 6.9 10 18.6 15.2 1.3 0 59.2
1913-14 0 0.2 0.3 14.4 15.1 21.6 27.9 7.2 0 86.7
1914-15 0 0.8 4.7 16.1 22.9 9.8 6 0.5 0 60.8
1915-16 0 0 3.4 14.8 8.5 35.7 43.8 0.7 0 106.9
1916-17 0 0 11.7 24.9 22.7 16.7 14.6 2.3 T 92.9
1917-18 0 T 7.9 29.7 17.2 12.7 10.5 1.3 0 79.3
run;
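For readers who want to cross-check the SAS result outside SAS, the same two ideas (treat either tab or space as a delimiter, and read the trace marker T as 0) can be sketched in plain Python. This is only an illustration under those assumptions, not a replacement for the informat approach. `parse_snow_line` is a hypothetical helper name, and the sample line is the first data row above with one tab inserted by hand.

```python
def parse_snow_line(line):
    """Split one data row on tabs or spaces and map 'T' (trace) to 0."""
    fields = line.split()  # no argument: splits on any run of whitespace
    season = fields[0]
    values = [0.0 if f == 'T' else float(f) for f in fields[1:]]
    return season, values


# First data row from the sample file, with a tab after the season label.
season, values = parse_snow_line("1884-85\t0 T 1 27.1 22.2 17 3.5 19.5 T 90.3")

# The nine monthly values (Sep..May) should add up to the Total column.
monthly_sum = sum(values[:9])
print(season, monthly_sum, values[9])
```

Comparing the monthly sum with the last field mirrors the MyTot versus Total check in the question's DATA step.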
I want to write rolling mean code for m_tax using pandas on Python 2.7 to analyze the time series data from the web page (http://owww.met.hu/eghajlat/eghajlati_adatsorok/bp/Navig/202_EN.htm).
datum m_ta m_tax m_taxd m_tan m_tand
------- ----- ----- ---------- ----- ----------
1901-01 -4.7 5.0 1901-01-23 -12.2 1901-01-10
1901-02 -2.1 3.5 1901-02-06 -7.9 1901-02-15
1901-03 5.8 13.5 1901-03-20 0.6 1901-03-01
1901-04 11.6 18.2 1901-04-10 7.4 1901-04-23
1901-05 16.8 22.5 1901-05-31 12.2 1901-05-05
1901-06 21.0 24.8 1901-06-03 14.6 1901-06-17
1901-07 22.4 27.4 1901-07-30 16.9 1901-07-04
1901-08 20.7 25.9 1901-08-01 14.7 1901-08-29
....
Here I tried my code as:
pd.rolling_mean(df.resample("1M", fill_method="ffill"), window=60, min_periods=1, center=True).mean()
and I got result:
m_ta 11.029173
m_tax 17.104283
m_tan 4.848637
month 6.499500
monthly_mean 11.030405
monthly_std 1.836159
m_tax% 0.083348
m_tan% 0.023627
dtype: float64
I also tried it another way:
s = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/1900', periods=1000))
s = s.cumsum()
r = s.rolling(window=60)
r.mean()
and I got result
1900-01-01 NaN
1900-01-02 NaN
1900-01-03 NaN
1900-01-04 NaN
1900-01-05 NaN
1900-01-06 NaN
1900-01-07 NaN
1900-01-08 NaN
...
So I am confused here. Which one should I use? Could someone please give me an idea? Thanks!
Starting with pandas version 0.18.0, rolling() and resample() are methods that behave similarly to groupby(), and the older function forms such as pd.rolling_mean() are deprecated.
What's new in pandas version 0.18.0
rolling()/expanding() in pandas version 0.18.0
resample() in pandas version 0.18.0
I can't tell exactly what your desired results are, but maybe something like this is what you want? (And you can see the warning message below, although I'm not sure what triggers it here.)
>>> df
m_ta m_tax m_taxd m_tan m_tand
datum
1901-01-01 -4.7 5.0 1901-01-23 -12.2 1901-01-10
1901-02-01 -2.1 3.5 1901-02-06 -7.9 1901-02-15
1901-03-01 5.8 13.5 1901-03-20 0.6 1901-03-01
1901-04-01 11.6 18.2 1901-04-10 7.4 1901-04-23
1901-05-01 16.8 22.5 1901-05-31 12.2 1901-05-05
1901-06-01 21.0 24.8 1901-06-03 14.6 1901-06-17
1901-07-01 22.4 27.4 1901-07-30 16.9 1901-07-04
1901-08-01 20.7 25.9 1901-08-01 14.7 1901-08-29
>>> df.resample("1M").rolling(3,center=True,min_periods=1).mean()
/Users/john/anaconda/lib/python3.5/site-packages/ipykernel/__main__.py:1: FutureWarning: .resample() is now a deferred operation
use .resample(...).mean() instead of .resample(...)
if __name__ == '__main__':
m_ta m_tax m_tan
datum
1901-01-31 -3.400000 4.250000 -10.050000
1901-02-28 -0.333333 7.333333 -6.500000
1901-03-31 5.100000 11.733333 0.033333
1901-04-30 11.400000 18.066667 6.733333
1901-05-31 16.466667 21.833333 11.400000
1901-06-30 20.066667 24.900000 14.566667
1901-07-31 21.366667 26.033333 15.400000
1901-08-31 21.550000 26.650000 15.800000
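The two results in the question differ for reasons that a small example can make visible. Calling .mean() on the output of a rolling mean collapses the whole rolling series to a single number per column, and a rolling window without min_periods yields NaN until the window is full. The sketch below assumes Python 3 and a recent pandas, uses only the eight m_tax values shown in the question, and skips resample() because the data is already monthly.

```python
import pandas as pd

# The eight monthly m_tax values shown in the question.
m_tax = pd.Series(
    [5.0, 3.5, 13.5, 18.2, 22.5, 24.8, 27.4, 25.9],
    index=pd.date_range('1901-01-01', periods=8, freq='MS'),
    name='m_tax',
)

# Without min_periods, the first window-1 results are NaN.
strict = m_tax.rolling(window=3).mean()

# With min_periods=1, partial windows at the edges are averaged as well.
relaxed = m_tax.rolling(window=3, center=True, min_periods=1).mean()

print(strict)
print(relaxed)
```

The `relaxed` series reproduces the m_tax column of the output above, so for this data the method form alone is enough.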
I am trying to select the following data using pandas on Python 2.7 from the web page (http://owww.met.hu/eghajlat/eghajlati_adatsorok/bp/Navig/202_EN.htm), covering the years 1991 to 2000. Could somebody please help me write the code? Thanks!
datum m_ta m_tax m_taxd m_tan m_tand
------- ----- ----- ---------- ----- ----------
1901-01 -4.7 5.0 1901-01-23 -12.2 1901-01-10
1901-02 -2.1 3.5 1901-02-06 -7.9 1901-02-15
1901-03 5.8 13.5 1901-03-20 0.6 1901-03-01
1901-04 11.6 18.2 1901-04-10 7.4 1901-04-23
1901-05 16.8 22.5 1901-05-31 12.2 1901-05-05
1901-06 21.0 24.8 1901-06-03 14.6 1901-06-17
1901-07 22.4 27.4 1901-07-30 16.9 1901-07-04
1901-08 20.7 25.9 1901-08-01 14.7 1901-08-29
1901-09 15.9 19.9 1901-09-01 11.8 1901-09-09
1901-10 12.6 17.9 1901-10-04 8.3 1901-10-31
1901-11 4.7 11.1 1901-11-14 -0.2 1901-11-26
1901-12 4.2 8.4 1901-12-22 -1.4 1901-12-07
1902-01 3.4 7.5 1902-01-25 -2.2 1902-01-15
1902-02 2.8 6.6 1902-02-09 -2.8 1902-02-06
1902-03 5.3 13.3 1902-03-22 -3.5 1902-03-13
1902-04 10.5 15.8 1902-04-21 6.1 1902-04-08
1902-05 12.5 20.6 1902-05-31 8.5 1902-05-10
1902-06 18.5 23.8 1902-06-30 14.4 1902-06-19
....
You can use .dt.year with boolean indexing to select rows by the column datum:
#convert column datum to period
df['datum'] = pd.to_datetime(df['datum']).dt.to_period('M')
#convert columns to datetime
df['m_taxd'] = pd.to_datetime(df['m_taxd'])
df['m_tand'] = pd.to_datetime(df['m_tand'])
print df.datum.dt.year
0 1901
1 1901
2 1901
3 1901
4 1901
5 1901
6 1901
7 1901
8 1901
9 1901
10 1901
11 1901
12 1902
13 1902
14 1902
15 1902
16 1902
17 1902
Name: datum, dtype: int64
#change 1901 to 2000
print df[df.datum.dt.year <= 1901]
datum m_ta m_tax m_taxd m_tan m_tand
0 1901-01 -4.7 5.0 1901-01-23 -12.2 1901-01-10
1 1901-02 -2.1 3.5 1901-02-06 -7.9 1901-02-15
2 1901-03 5.8 13.5 1901-03-20 0.6 1901-03-01
3 1901-04 11.6 18.2 1901-04-10 7.4 1901-04-23
4 1901-05 16.8 22.5 1901-05-31 12.2 1901-05-05
5 1901-06 21.0 24.8 1901-06-03 14.6 1901-06-17
6 1901-07 22.4 27.4 1901-07-30 16.9 1901-07-04
7 1901-08 20.7 25.9 1901-08-01 14.7 1901-08-29
8 1901-09 15.9 19.9 1901-09-01 11.8 1901-09-09
9 1901-10 12.6 17.9 1901-10-04 8.3 1901-10-31
10 1901-11 4.7 11.1 1901-11-14 -0.2 1901-11-26
11 1901-12 4.2 8.4 1901-12-22 -1.4 1901-12-07
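Since the question asks for 1991 to 2000, the mask needs a lower and an upper bound. The following is a minimal sketch, assuming Python 3 and a recent pandas. The five rows are copied from the scraped table of the same page, and only datum and m_tax are kept to stay short.

```python
import pandas as pd

# Five rows copied from the scraped table (datum and m_tax only).
df = pd.DataFrame({
    'datum': ['1901-01', '1903-06', '1998-07', '1999-12', '2000-12'],
    'm_tax': [5.0, 23.1, 30.7, 6.4, 9.4],
})
df['datum'] = pd.to_datetime(df['datum'], format='%Y-%m').dt.to_period('M')

# Combine two comparisons with & (each side in parentheses).
years = df['datum'].dt.year
selected = df[(years >= 1991) & (years <= 2000)]
print(selected)
```

An equivalent mask is `years.between(1991, 2000)`, which includes both end points by default.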
One:
data have;
input x1 x2;
diff=x1-x2;
a_diff= round(abs(diff), .01);
* a_diff=abs(diff);
cards;
50.7 60
3.3 3.3
28.8 30
46.2 43.2
1.2 2.2
25.5 27.5
2.9 4.9
5.4 5
3.8 3.2
1 4
;
run;
proc rank data =have out =have_r;
where diff;
var a_diff ;
ranks a_diff_r;
run;
proc print data =have_r;run;
Results:
Obs x1 x2 diff a_diff a_diff_r
1 50.7 60.0 -9.3 9.3 9.0
2 28.8 30.0 -1.2 1.2 4.0
3 46.2 43.2 3.0 3.0 7.5
4 1.2 2.2 -1.0 1.0 3.0
5 25.5 27.5 -2.0 2.0 5.5
6 2.9 4.9 -2.0 2.0 5.5
7 5.4 5.0 0.4 0.4 1.0
8 3.8 3.2 0.6 0.6 2.0
9 1.0 4.0 -3.0 3.0 7.5
Two:
data have;
input x1 x2;
diff=x1-x2;
a_diff=abs(diff);
cards;
50.7 60
3.3 3.3
28.8 30
46.2 43.2
1.2 2.2
25.5 27.5
2.9 4.9
5.4 5
3.8 3.2
1 4
;
run;
proc rank data =have out =have_r;
where diff;
var a_diff ;
ranks a_diff_r;
run;
proc print data =have_r;run;
results:
Obs x1 x2 diff a_diff a_diff_r
1 50.7 60.0 -9.3 9.3 9.0
2 28.8 30.0 -1.2 1.2 4.0
3 46.2 43.2 3.0 3.0 7.5
4 1.2 2.2 -1.0 1.0 3.0
5 25.5 27.5 -2.0 2.0 5.0
6 2.9 4.9 -2.0 2.0 6.0
7 5.4 5.0 0.4 0.4 1.0
8 3.8 3.2 0.6 0.6 2.0
9 1.0 4.0 -3.0 3.0 7.5
Please look at Obs 3, 9, 5 and 6: why are the ranks different between the two runs? Thank you!
Run the code below and you'll see that the values are actually different. That's because of inaccuracies in numeric storage. It is similar to how 1/3 is not representable in decimal notation (0.333333333333333 etc.): 1-(1/3)-(1/3)-(1/3) is not equal to zero if you use, say, ten digits to store each result as you go (it is then equal to 0.0000000001). In the same way, any computer system will have some issues with certain numbers that appear to store nicely in decimal (base 10) but do not in binary.
The solution here is basically to round as you are doing, or to fuzz the result, which amounts to the same thing (it ignores differences less than 1x10^-12).
data have;
input x1 x2;
diff=x1-x2;
a_diff=abs(diff);
put a_diff= hex16.;
cards;
50.7 60
3.3 3.3
28.8 30
46.2 43.2
1.2 2.2
25.5 27.5
2.9 4.9
5.4 5
3.8 3.2
1 4
;
run;
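The same effect can be reproduced outside SAS, because Python floats use the same 64-bit binary representation. The sketch below is only an illustration of the storage issue and uses the two pairs from observations 5 and 6. float.hex() gives a hexadecimal view of the stored value, similar in spirit to the hex16. output.

```python
# Differences taken from observations 5 and 6 of the example data.
a = abs(25.5 - 27.5)   # both operands are exactly representable in binary
b = abs(2.9 - 4.9)     # neither operand is exactly representable

# Compare the raw values and show their hexadecimal representations.
same_raw = (a == b)
print(same_raw)
print(a.hex(), b.hex())

# Rounding to two decimals before comparing removes the tiny discrepancy.
same_rounded = (round(a, 2) == round(b, 2))
print(same_rounded)
```

This mirrors the two SAS runs: ranking the raw differences treats the two values as distinct, while ranking the rounded differences treats them as a tie.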
I want to write Python 2.7 code to calculate the percentages of m_tax and m_tan for the data from the web page (http://owww.met.hu/eghajlat/eghajlati_adatsorok/bp/Navig/202_EN.htm). I already have the DataFrame code, but I couldn't write the percentage code. Could somebody please help me write it? Thanks!
datum m_ta m_tax m_taxd m_tan m_tand
------- ----- ----- ---------- ----- ----------
1901-01 -4.7 5.0 1901-01-23 -12.2 1901-01-10
1901-02 -2.1 3.5 1901-02-06 -7.9 1901-02-15
1901-03 5.8 13.5 1901-03-20 0.6 1901-03-01
1901-04 11.6 18.2 1901-04-10 7.4 1901-04-23
1901-05 16.8 22.5 1901-05-31 12.2 1901-05-05
1901-06 21.0 24.8 1901-06-03 14.6 1901-06-17
1901-07 22.4 27.4 1901-07-30 16.9 1901-07-04
1901-08 20.7 25.9 1901-08-01 14.7 1901-08-29
You can call div and pass the sum of a column to add % columns (as written below, both new columns are divided by the sum of m_tax):
In [66]:
df['m_tax%'],df['m_tan%'] = df['m_tax'].div(df['m_tax'].sum()) * 100, df['m_tan'].div(df['m_tax'].sum()) * 100
df
Out[66]:
datum m_ta m_tax m_taxd m_tan m_tand m_tax% m_tan%
0 1901-01 -4.7 5.0 1901-01-23 -12.2 1901-01-10 3.551136 -8.664773
1 1901-02 -2.1 3.5 1901-02-06 -7.9 1901-02-15 2.485795 -5.610795
2 1901-03 5.8 13.5 1901-03-20 0.6 1901-03-01 9.588068 0.426136
3 1901-04 11.6 18.2 1901-04-10 7.4 1901-04-23 12.926136 5.255682
4 1901-05 16.8 22.5 1901-05-31 12.2 1901-05-05 15.980114 8.664773
5 1901-06 21.0 24.8 1901-06-03 14.6 1901-06-17 17.613636 10.369318
6 1901-07 22.4 27.4 1901-07-30 16.9 1901-07-04 19.460227 12.002841
7 1901-08 20.7 25.9 1901-08-01 14.7 1901-08-29 18.394886 10.440341
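If each percentage should instead be relative to its own column total, both columns can be divided by their own sums in one step. This is a minimal sketch under that assumption, using Python 3, a recent pandas, and the eight rows from the question. Whether m_tan should be scaled by its own sum is a guess at the intent.

```python
import pandas as pd

# m_tax and m_tan for the eight rows shown in the question.
df = pd.DataFrame({
    'm_tax': [5.0, 3.5, 13.5, 18.2, 22.5, 24.8, 27.4, 25.9],
    'm_tan': [-12.2, -7.9, 0.6, 7.4, 12.2, 14.6, 16.9, 14.7],
})

# Divide each column by its own total, then scale to percent.
# DataFrame.div aligns the Series of sums with the column labels.
cols = ['m_tax', 'm_tan']
pct = df[cols].div(df[cols].sum()) * 100
df['m_tax%'] = pct['m_tax']
df['m_tan%'] = pct['m_tan']
print(df)
```

Each percentage column then sums to 100, which gives a quick sanity check on the result.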