I wanted to use proc http to scrape quotes off Yahoo Finance. It did not produce the HTML in the out file, but when I used debug level=3 to figure out what happened, the full HTML appeared in the log. What happened? Clearly I want the HTML in the out file, but the alternative of saving the log as a text file would be sufficient as well. How can I do that?
filename Testing "&Folder&OutFile";
%let YahooFin = "https://finance.yahoo.com/quote/SNAP/";
proc http
url = &YahooFin.
out = Testing
method = "get";
debug level = 3;
run;
EDIT
Here are my macro variables. Just to clarify: I did receive HTML from my original script, but it was just the top portion, not the whole thing. See the enclosed image.
%let Folder = /06specialty/Practice Scripts/;
%let OutFile = TestSNAP.txt;
[Sample image: the output file contains only the top portion of the HTML]
The HTML that link generates has some really long lines (up to 541,567 bytes).
Make sure to tell SAS to use a long enough LRECL if you are going to try to read it with a DATA step.
1149 proc http
1150 url = &YahooFin.
1151 out = Testing
1152 method = "get";
1153 run;
NOTE: PROCEDURE HTTP used (Total process time):
real time 0.98 seconds
cpu time 0.04 seconds
NOTE: 200 OK
1154
1155 data _null_;
1156 infile testing lrecl=1000000 length=ll ;
1157 input;
1158 run;
NOTE: The infile TESTING is:
Filename=XXXX\#LN00123,
RECFM=V,LRECL=1000000,File Size (bytes)=777076,
Last Modified=21Mar2021:15:23:01,
Create Time=21Mar2021:15:20:45
NOTE: 116 records were read from the infile TESTING.
The minimum record length was 0.
The maximum record length was 541567.
NOTE: DATA statement used (Total process time):
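If the goal is simply to land the complete HTML in a plain text file, a minimal sketch (the output path below is a placeholder): read the downloaded file with a large enough LRECL and write each record back out.
data _null_;
infile testing lrecl=1000000;
file "/path/to/full_page.txt" lrecl=1000000; * placeholder path;
input;
put _infile_;
run;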
What does your OutFile macro variable resolve to? What is the extension of your output file?
I tried with a .txt file:
filename Testing "/path/to/want.txt";
%let YahooFin = "https://finance.yahoo.com/quote/SNAP/";
proc http
url = &YahooFin.
out = Testing
method = "get";
run;
And it worked.
I would suggest using the Yahoo Finance chart API to get quotes or historical prices for a stock:
/* Let's fetch historical data for Microsoft */
%let stock=MSFT;
filename test "%sysfunc(getoption(WORK))\&stock..json";
proc http
url="https://query1.finance.yahoo.com/v7/finance/chart/&stock?range=1mo%str(&)interval=1d%str(&)indicators=quote%str(&)includeTimestamps=true%str(&)includeTimestamps=true%str(&)crumb=&getCrumb."
out=test;
*debug level=3;
run;
LIBNAME stock JSON "%sysfunc(getoption(WORK))\&stock..json";
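To check what the JSON engine created, a quick sketch (ALLDATA is the generic member the engine always builds; the other member names depend on the shape of the response):
* list the tables created by the JSON libname engine;
proc datasets lib=stock nodetails;
quit;
* ALLDATA holds every name/value pair in a generic key/value layout;
proc print data=stock.alldata(obs=20);
run;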
You can get more details from https://myweirdcodes.com/stocks-fetching-historical-stock-data-with-sas
I have been working on this for 3 days now and have tried all I can think of, including %str(), %bquote(), translate() and tranwrd(), to replace a single apostrophe with a doubled apostrophe or %'.
The data step and macro below work fine until I hit a last name which contains an apostrophe, e.g. O'Brien. I then encounter syntax errors due to unclosed left parentheses. In the code below I have left what I thought was closest to working, with the tranwrd included.
Any assistance you can provide is greatly appreciated.
%macro put_data (object1,id);
proc http
method="put"
url="https://myurl/submissionid/&id"
in=&object1;
headers "content-type"="application/json";
run;
%mend;
data _null_;
set work.emp_lis;
call execute(catt('%put_data(','%quote(''{"data":{"employeeName":"',tranwrd(employeeName,"''","'"),'"}}''),',id,')'));
run;
Craig
There is a wide range of potential problems in constructing or passing a JSON string in SAS macro. Proc JSON will produce valid JSON (in a file) from data, and that file in turn can be specified as the input to be consumed by your web service.
Example:
data have;
length id 8 name $25;
input id& name&;
datalines;
1 Homer Simpson
2 Ned Flanders
3 Shaun O'Connor
4 Tom&Bob
5 'X Æ A-12'
6 Goofy "McDuck"
;
%macro put_data (data,id);
filename jsonfile temp;
proc json out=jsonfile;
export &data.(where=(id=&id));
write values "data";
write open object;
write values "name" name;
write close;
run;
proc http
method="put"
url="https://myurl/submissionid/&id"
in=jsonfile
;
headers "content-type"="application/json";
run;
%mend;
data _null_;
set have;
call execute(cats('%nrstr(%put_data(have,',id,'))'));
run;
I was able to find the issue with my code: the tranwrd arguments were backwards, and the call needed to move to the proc sql create table statement. I also needed to wrap &object1 in %bquote. This is the final code that worked.
When creating the table, wrap the variable in tranwrd as below.
tranwrd(employeeName, "'", "''")
%macro put_data (object1,id);
proc http
method="put"
url="https://myurl/submissionid/&id"
in=%bquote(&object1);
headers "content-type"="application/json";
run;
%mend;
data _null_;
set work.emp_lis;
call execute(catt('%put_data(','%quote(''{"data":{"employeeName":"',employeeName,'"}}''),',id,')'));
run;
Just use actual quotes and you won't have to worry about macro quoting at all.
So if your macro looks like this:
%macro put_data(object1,id);
proc http method="put"
url="https://myurl/submissionid/&id"
in=&object1
;
headers "content-type"="application/json";
run;
%mend;
Then the value of OBJECT1 would usually be a quoted string literal or a fileref. (There are actually other forms.) It looks like you are trying to generate a quoted string, so just use the QUOTE() function.
So if your data looks like:
data emp_lis;
input id employeeName $50.;
cards;
1234 O'Brien
4567 Smith
;
Then you can use a data step like this to generate one macro call for each observation.
data _null_;
set emp_lis;
call execute(cats
('%nrstr(%put_data)('
,quote(cats('{"data":{"employeeName":',quote(trim(employeeName)),'}}'))
,',',id
,')'
));
run;
And your SAS log will look something like:
NOTE: CALL EXECUTE generated line.
1 + %put_data("{""data"":{""employeeName"":""O'Brien""}}",1234)
NOTE: PROCEDURE HTTP used (Total process time):
real time 2.46 seconds
cpu time 0.04 seconds
2 + %put_data("{""data"":{""employeeName"":""Smith""}}",4567)
NOTE: PROCEDURE HTTP used (Total process time):
real time 2.46 seconds
cpu time 0.04 seconds
Good Morning
So I have tried to download a zip file from a website and to save it to a specific location.
The location I want to put is
S:\Projects\
Method 1:
My first attempt is below:
DATA _null_ ;
x 'start https://yehonal.github.io/DownGit/#/home?url=https:%2F%2Fgithub.com%2FCSSEGISandData%2FCOVID-19%2Ftree%2Fmaster%2Fcsse_covid_19_data%2Fcsse_covid_19_daily_reports';
RUN ;
With Method 1 I can download the file, but it automatically goes to my Downloads folder.
Method 2:
So I found this way:
filename out "S:\Projects\csse_covid_19_daily_reports.zip";
proc http
url='https://yehonal.github.io/DownGit/#/home?url=https:%2F%2Fgithub.com%2FCSSEGISandData%2FCOVID-19%2Ftree%2Fmaster%2Fcsse_covid_19_data%2Fcsse_covid_19_daily_reports'
method="get" out=out;
run;
But the code is not working; it does not download anything.
How can I download the file from the web and save it to a specific location?
I would probably recommend a macro in this case (or CALL EXECUTE); I prefer macros, then calling the macro via CALL EXECUTE. This took about a minute running on SAS OnDemand for Academics (the free cloud service).
*set start date for files;
%let start_date = 01-22-2020;
*macro to import data;
%macro importFullData(date);
*file name reference;
filename out "/home/fkhurshed/WANT/&date..csv";
*file to download;
%let download_url = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/&date..csv";
proc http url=&download_url
method="get" out=out;
run;
*You can add in data import/append steps here as well as necessary;
%mend;
%importFullData(&start_date.);
data importAll;
start_date=input("&start_date", mmddyy10.);
*runs up to previous day;
end_date=today() - 1;
do date=start_date to end_date;
formatted_date=put(date, mmddyyd10.);
str=catt('%importFullData(', formatted_date, ');');
call execute(str);
end;
run;
The URL, when viewed in a browser, uses JavaScript in the browser to construct a zip file that is automatically downloaded. Proc HTTP does not run JavaScript, so it will not be able to download the ultimate response, which is the constructed zip file; thus you get the 404 message.
The list of the files in the repository can be obtained as JSON from the URL
https://api.github.com/repos/CSSEGISandData/COVID-19/contents/csse_covid_19_data/csse_covid_19_daily_reports
The listing data contains the download_url for each csv file.
A download_url will look like
https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports/01-22-2020.csv
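A sketch of that first step with PROC HTTP and the JSON libname engine (assuming the engine surfaces the top-level array as a member named ROOT with NAME and DOWNLOAD_URL columns; verify against your own log):
filename resp temp;
proc http
url="https://api.github.com/repos/CSSEGISandData/COVID-19/contents/csse_covid_19_data/csse_covid_19_daily_reports"
method="get"
out=resp;
run;
libname gh json fileref=resp;
* keep the file names and their raw download URLs;
data csv_list;
set gh.root;
if lowcase(scan(name,-1,'.')) = 'csv';
keep name download_url;
run;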
You can download individual files with SAS per @Reeza, or:
- use git commands or SAS git* functions to download the repository. AFAIK git archive, for downloading only a specific subfolder of a repository, is not surfaced by the GitHub server.
- use svn commands to download a specific folder from a git repository. This requires svn to be installed (https://subversion.apache.org/); I used SlikSvn.
Example:
Make some series plots of a response by date from stacked imported downloaded data.
options noxwait xsync xmin source;
* use svn to download all files in a subfolder of a git repository;
* local folder for storing data from
* COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University;
%let covid_data_root = c:\temp\csse;
%let rc = %sysfunc(dcreate(covid,&covid_data_root));
%let download_path = &covid_data_root\covid;
%let repo_subdir_url = https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_daily_reports;
%let svn_url = %sysfunc(tranwrd(&repo_subdir_url, tree/master, trunk));
%let os_command = svn checkout &svn_url "&download_path";
/*
* uncomment this block to download the (data) files from the repository subfolder;
%sysexec %superq(os_command);
*/
* codegen and execute the PROC IMPORT steps needed to read each csv file downloaded;
libname covid "&covid_data_root.\covid";
filename csvlist pipe "dir /b ""&download_path""";
data _null_;
infile csvlist length=l;
length filename $200;
input filename $varying. l;
if lowcase(scan(filename,-1,'.')) = 'csv';
out = 'covid.day_'||translate(scan(filename,1,'.'),'_','-');
/*
* NOTE: Starting 08/11/2020 FIPS data first starts appearing after a few hundred rows.
* Thus the high GuessingRows
*/
template =
'PROC IMPORT file="#path#\#filename#" replace out=#out# dbms=csv; ' ||
'GuessingRows = 5000;' ||
'run;';
source_code = tranwrd (template, "#path#", "&download_path");
source_code = tranwrd (source_code, "#filename#", trim(filename));
source_code = tranwrd (source_code, "#out#", trim(out));
/* uncomment this line to import each data file */
*call execute ('%nrstr(' || trim (source_code) || ')');
run;
* memname is always uppercase;
proc contents noprint data=covid._all_ out=meta(where=(memname like 'DAY_%'));
run;
* compute variable lengths for LENGTH statement;
proc sql noprint;
select
catx(' ', name, case when type=2 then '$' else '' end, maxlen)
into
:lengths separated by ' '
from
( select name, min(type) as type, max(length) as maxlen, min(varnum) as minvarnum, max(varnum) as maxvarnum
from meta
group by name
)
order by minvarnum, maxvarnum
;
quit;
* stack all the individual daily data;
data covid.all_days;
attrib date length=8 format=mmddyy10.;
length &lengths;
set covid.day_: indsname=dsname;
date = input(substr(dsname,11),mmddyy10.);
format _character_; * reset length based formats;
informat _character_; * reset length based informats;
run ;
proc sort data=covid.all_days out=us_days;
where country_region = 'US';
by province_state admin2 date;
run;
ods html gpath='.' path='.' file='covid.html';
options nobyline;
proc sgplot data=us_days;
where province_state =: 'Cali';
*where also admin2=: 'O';
by province_state admin2;
title "#byval2 County, #byval1";
series x=date y=confirmed;
xaxis valuesformat=monname3.;
label province_state='State' admin2='County';
label confirmed='Confirmed (cumulative?)';
run;
ods html close;
options byline;
[Plot output: confirmed-case series by date for each selected county]
My program makes a web-service call and receives a response in XML format, which I store as output.txt. When opened in Notepad, the file looks like this:
<OwnerInquiryResponse xmlns="http://www.fedex.com/esotservice/schema"><ResponseHeader><TimeStamp time="2018-02-01T16:09:19.319Z"/></ResponseHeader><Owner><Employee firstName="Gerald" lastName="Harris" emplnbr="108181"/><SalesAttribute type="Sales"/><Territory NodeGlobalRegion="US" SegDesc="Worldwide Sales" SegNbr="1" TTY="2-2-1-2-1-1-10"/></Owner><Delegates/><AlignmentDetail><SalesAttribute type="Sales"/><Alignments/></AlignmentDetail></OwnerInquiryResponse>
I am unable to read this file into SAS using proc IMPORT. My SAS code is below
proc import datafile="/mktg/prc203/abhee/output.txt" out=work.test2 dbms=dlm replace;
delimiter='<>"=';
getnames=yes;
run;
My log is
1 %_eg_hidenotesandsource;
5 %_eg_hidenotesandsource;
28
29 proc import datafile="/mktg/prc203/abhee/output.txt" out=work.test2 dbms=dlm replace;
30 delimiter='<>"=';
31 getnames=yes;
32 run;
NOTE: Unable to open parameter catalog: SASUSER.PARMS.PARMS.SLIST in update mode. Temporary parameter values will be saved to
WORK.PARMS.PARMS.SLIST.
Unable to sample external file, no data in first 5 records.
ERROR: Import unsuccessful. See SAS Log for details.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 0.09 seconds
cpu time 0.09 seconds
33
34 %_eg_hidenotesandsource;
46
47
48 %_eg_hidenotesandsource;
51
My ultimate goal is to mine the employee first name (Gerald), last name (Harris) and employee number (108181) from the above file and store them in a dataset (and then do this over and over again in a loop, appending to the same dataset). Help with importing either the entire file or just the information I need directly would be appreciated.
If you only need these three fields, then a single input statement is perfectly viable, and arguably preferable to parsing xml with regex:
data want;
infile xmlfile dsd dlm = ' /';
input #"Employee" #"firstName=" firstName :$32. #"lastName=" lastName :$32. #"emplnbr=" emplnbr :8.;
run;
This uses the input file constructed in Richard's answer. The initial @"Employee" is optional, but it reduces the risk of picking up fields with the same names as the desired ones that are subfields of a different top-level field.
Bonus: the same approach can also be used to import json files if you're in a similar situation.
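For instance, a minimal sketch of the JSON case, assuming a hypothetical one-object-per-line file and field name:
* jsonfile is assumed to contain lines like {"data":{"employeeName":"O'Brien"}};
data want;
infile jsonfile dsd dlm=' ,:{}';
input @'"employeeName"' employeeName :$32.;
run;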
Since you are unable to use the preferred methods of reading xml data, and you are processing a single-record result from a service query, the git'er done approach seems warranted.
One idea that did not pan out was to use named input.
input @'Employee' lastname= firstname= emplnbr=;
The results could not be made to strip the quotes with the $QUOTE. informat, nor to honor infile dlm=' /'.
An approach that did work was to read the single line and parse the value out using a regular expression with capture groups. PRXPARSE is used to compile a pattern, PRXMATCH to test for a match and PRXPOSN to retrieve the capture group.
* create a file to read from (represents the file from the service call capture);
options ls=max;
filename xmlfile "%sysfunc(pathname(WORK))\1-service-call-record.xml";
data have;
input;
file xmlfile;
put _infile_;
datalines;
<OwnerInquiryResponse xmlns="http://www.fedex.com/esotservice/schema"><ResponseHeader><TimeStamp time="2018-02-01T16:09:19.319Z"/></ResponseHeader><Owner><Employee firstName="Gerald" lastName="Harris" emplnbr="108181"/><SalesAttribute type="Sales"/><Territory NodeGlobalRegion="US" SegDesc="Worldwide Sales" SegNbr="1" TTY="2-2-1-2-1-1-10"/></Owner><Delegates/><AlignmentDetail><SalesAttribute type="Sales"/><Alignments/></AlignmentDetail></OwnerInquiryResponse>
run;
* read the entire line from the file and parse out the values using Perl regular expression;
data want;
infile xmlfile;
input;
rx_employee = prxparse('/employee\s+firstname="([^"]+)"\s+lastname="([^"]+)"\s+emplnbr="([^"]+)"/i');
if prxmatch(rx_employee,_infile_) then do;
firstname = prxposn(rx_employee, 1, _infile_);
lastname = prxposn(rx_employee, 2, _infile_);
emplnbr = prxposn(rx_employee, 3, _infile_);
end;
keep firstname lastname emplnbr;
run;
I am creating a dataset using filename URL web submissions. However, in some instances I keep getting '502' responses from the server. To get around this I would like to use some conditional logic inside a macro. I'm most of the way there, but I can't quite get the end bit to work. The idea is that the macro, which is nested within other nested macros, will keep retrying this one submission until it gets a dataset that doesn't have 0 observations, then move on:
%macro test_exst;
filename loader url "http://finance.yahoo.com/d/quotes.csv?s=&svar1.+&svar2.+&svar3.+&svar4.+&svar5.+&svar6.+&svar7.+&svar8.+&svar9.+&svar10.+
&svar11.+&svar12.+&svar13.+&svar14.+&svar15.+&svar16.+&svar17.+&svar18.+&svar19.+&svar20.+
&svar21.+&svar22.+&svar23.+&svar24.+&svar25.+&svar26.+&svar27.+&svar28.+&svar29.+&svar30.+
&svar31.+&svar32.+&svar33.+&svar34.+&svar35.+&svar36.+&svar37.+&svar38.+&svar39.+&svar40.+
&svar41.+&svar42.+&svar43.+&svar44.+&svar45.+&svar46.+&svar47.+&svar48.+&svar49.+&svar50.+
&svar51.+&svar52.+&svar53.+&svar54.+&svar55.+&svar56.+&svar57.+&svar58.+&svar59.+&svar60.+
&svar61.+&svar62.+&svar63.+&svar64.+&svar65.+&svar66.+&svar67.+&svar68.+&svar69.+&svar70.+
&svar71.+&svar72.+&svar73.+&svar74.+&svar75.+&svar76.+&svar77.+&svar78.+&svar79.+&svar80.+
&svar81.+&svar82.+&svar83.+&svar84.+&svar85.+&svar86.+&svar87.+&svar88.+&svar89.+&svar90.+
&svar91.+&svar92.+&svar93.+&svar94.+&svar95.+&svar96.+&svar97.+&svar98.+&svar99.+&svar100.+
&svar101.+&svar102.+&svar103.+&svar104.+&svar105.+&svar106.+&svar107.+&svar108.+&svar109.+&svar110.+
&svar111.+&svar112.+&svar113.+&svar114.+&svar115.+&svar116.+&svar117.+&svar118.+&svar119.+&svar120.+
&svar121.+&svar122.+&svar123.+&svar124.+&svar125.+&svar126.+&svar127.+&svar128.+&svar129.+&svar130.+
&svar131.+&svar132.+&svar133.+&svar134.+&svar135.+&svar136.+&svar137.+&svar138.+&svar139.+&svar140.+
&svar141.+&svar142.+&svar143.+&svar144.+&svar145.+&svar146.+&svar147.+&svar148.+&svar149.+&svar150.+
&svar151.+&svar152.+&svar153.+&svar154.+&svar155.+&svar156.+&svar157.+&svar158.+&svar159.+&svar160.+
&svar161.+&svar162.+&svar163.+&svar164.+&svar165.+&svar166.+&svar167.+&svar168.+&svar169.+&svar170.+
&svar171.+&svar172.+&svar173.+&svar174.+&svar175.+&svar176.+&svar177.+&svar178.+&svar179.+&svar180.+
&svar181.+&svar182.+&svar183.+&svar184.+&svar185.+&svar186.+&svar187.+&svar188.+&svar189.+&svar190.+
&svar191.+&svar192.+&svar193.+&svar194.+&svar195.+&svar196.+&svar197.+&svar198.+&svar199.+&svar200.
&f=&&fvar&a." DEBUG ;
/* data step based on filename url above goes here, each pass will give 500 metrics x 1 symbol dataset*/
%put create dataset from csv submission;
data temp_&I._&&fvar&a.;
infile loader length=len MISSOVER /*delimiter = ','*/;
/* input record $varying8192. len; */
input record $varying30. len;
format record $30.;
informat record $30.;
run;
data _null_;
dsid=open("temp_&I._&&fvar&a.");
obs=attrn(dsid,"nobs");
put "number of observations = " obs;
if obs = 0 then stop;
else;
filename loader url "http://finance.yahoo.com/d/quotes.csv?s=&svar1.+&svar2.+&svar3.+&svar4.+&svar5.+&svar6.+&svar7.+&svar8.+&svar9.+&svar10.+
&svar11.+&svar12.+&svar13.+&svar14.+&svar15.+&svar16.+&svar17.+&svar18.+&svar19.+&svar20.+
&svar21.+&svar22.+&svar23.+&svar24.+&svar25.+&svar26.+&svar27.+&svar28.+&svar29.+&svar30.+
&svar31.+&svar32.+&svar33.+&svar34.+&svar35.+&svar36.+&svar37.+&svar38.+&svar39.+&svar40.+
&svar41.+&svar42.+&svar43.+&svar44.+&svar45.+&svar46.+&svar47.+&svar48.+&svar49.+&svar50.+
&svar51.+&svar52.+&svar53.+&svar54.+&svar55.+&svar56.+&svar57.+&svar58.+&svar59.+&svar60.+
&svar61.+&svar62.+&svar63.+&svar64.+&svar65.+&svar66.+&svar67.+&svar68.+&svar69.+&svar70.+
&svar71.+&svar72.+&svar73.+&svar74.+&svar75.+&svar76.+&svar77.+&svar78.+&svar79.+&svar80.+
&svar81.+&svar82.+&svar83.+&svar84.+&svar85.+&svar86.+&svar87.+&svar88.+&svar89.+&svar90.+
&svar91.+&svar92.+&svar93.+&svar94.+&svar95.+&svar96.+&svar97.+&svar98.+&svar99.+&svar100.+
&svar101.+&svar102.+&svar103.+&svar104.+&svar105.+&svar106.+&svar107.+&svar108.+&svar109.+&svar110.+
&svar111.+&svar112.+&svar113.+&svar114.+&svar115.+&svar116.+&svar117.+&svar118.+&svar119.+&svar120.+
&svar121.+&svar122.+&svar123.+&svar124.+&svar125.+&svar126.+&svar127.+&svar128.+&svar129.+&svar130.+
&svar131.+&svar132.+&svar133.+&svar134.+&svar135.+&svar136.+&svar137.+&svar138.+&svar139.+&svar140.+
&svar141.+&svar142.+&svar143.+&svar144.+&svar145.+&svar146.+&svar147.+&svar148.+&svar149.+&svar150.+
&svar151.+&svar152.+&svar153.+&svar154.+&svar155.+&svar156.+&svar157.+&svar158.+&svar159.+&svar160.+
&svar161.+&svar162.+&svar163.+&svar164.+&svar165.+&svar166.+&svar167.+&svar168.+&svar169.+&svar170.+
&svar171.+&svar172.+&svar173.+&svar174.+&svar175.+&svar176.+&svar177.+&svar178.+&svar179.+&svar180.+
&svar181.+&svar182.+&svar183.+&svar184.+&svar185.+&svar186.+&svar187.+&svar188.+&svar189.+&svar190.+
&svar191.+&svar192.+&svar193.+&svar194.+&svar195.+&svar196.+&svar197.+&svar198.+&svar199.+&svar200.
&f=&&fvar&a." DEBUG ;
data temp_&I._&&fvar&a.;
infile loader length=len MISSOVER /*delimiter = ','*/;
/* input record $varying8192. len; */
input record $varying30. len;
format record $30.;
informat record $30.;
run;
run;
%mend;
%test_exst;
The idea here is: try the URL submission, create a dataset from it, and check that the number of obs is not zero. If it's not zero, end the macro. If it is, resubmit the same filename URL and create the dataset from it again. Keep doing this until the server responds, then exit the macro and move on to the rest of the code.
I haven't got as far as running this code in anger yet. I'm guessing the filename URL will work fine, but the code attempts to issue a filename statement and create a dataset inside the final DATA _NULL_ step, and I suspect that is what is making it fall over. Any ideas?
Thanks
Without getting into the specifics of your project, a good way to approach this generally is with recursion.
%macro test_me(iter);
%let iter=%eval(&iter.+1);
data my_data;
infile myfilename;
input stuff;
call symputx("obscount",_n_);
run;
%if &obscount=1 and &iter. < 10 %then %do;
%put Iteration &iter. failed, trying again;
%test_me(&iter.);
%end;
%mend test_me;
%test_me(0);
It checks to see if it worked, and if it did not work, it calls itself again, with a maximum iteration count to make sure you don't end up in infinite loop land if the server is down or somesuch. You also might put a delay in there if the server has a maximum frequency of calls or any other rules the API requires you to follow.
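For example, a delay between attempts could be a short step with the SLEEP function, placed just before the recursive call (the 5-second value here is arbitrary):
* pause 5 seconds before retrying (the second argument sets the unit to 1 second);
data _null_;
rc = sleep(5, 1);
run;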
I am trying to download the entire TAQ database on WRDS using SAS. Following is the SAS code given by a person from WRDS technical support:
%let wrds=wrds.wharton.upenn.edu 4016;
options comamid=TCP remote=WRDS;
signon username=_prompt_;
%macro taq_daily_dataset_list(type=ctm,begyyyymmdd=20100101,endyyyymmdd=20111231) / des="Autogenerated list of needed Daily TAQ datasets";
%let type=%lowcase(&type);
/* Get SAS date values for date range endpoints */
%let begdate = %sysfunc(inputn(&begyyyymmdd,yymmdd8.));
%let enddate = %sysfunc(inputn(&endyyyymmdd,yymmdd8.));
%do d=&begdate %to &enddate /** For each date in the DATE range */;
%let yyyymmdd=%sysfunc(putn(&d,yymmddn8.));
/*If the corresponding dataset exists, add it to the list */
%if %sysfunc(exist(taqmsec.&type._&yyyymmdd)) %then taqmsec.&type._&yyyymmdd;
%end;
%mend;
* using this macro;
data my_output;
set %taq_daily_dataset_list(type=ctm,begyyyymmdd=20100101,endyyyymmdd=20121231) open=defer;
run;
I tried to run this in SAS, but it gave me an error "THERE IS NOT A DEFAULT INPUT DATA SET (_LAST_IS_NULL)". I don't know how to use SAS, not even a little. All I want is to download the database.
I would really appreciate it if someone could help me out here.
The code you are running starts a SAS/CONNECT session from your computer to a remote server. Once you connect, I'm assuming the libname TAQMSEC is defined on the server. So my guess is that you need to "remote submit" the code (which will create the SAS dataset my_output in the server's WORK library). Then you can use PROC DOWNLOAD to copy it to your local machine:
%let wrds=wrds.wharton.upenn.edu 4016;
options comamid=TCP remote=WRDS;
signon username=_prompt_;
RSUBMIT; /* Execute following on server after logging in */
%macro taq_daily_dataset_list(type=ctm,begyyyymmdd=20100101,endyyyymmdd=20111231) / des="Autogenerated list of needed Daily TAQ datasets";
%let type=%lowcase(&type);
/* Get SAS date values for date range endpoints */
%let begdate = %sysfunc(inputn(&begyyyymmdd,yymmdd8.));
%let enddate = %sysfunc(inputn(&endyyyymmdd,yymmdd8.));
%do d=&begdate %to &enddate /** For each date in the DATE range */;
%let yyyymmdd=%sysfunc(putn(&d,yymmddn8.));
/*If the corresponding dataset exists, add it to the list */
%if %sysfunc(exist(taqmsec.&type._&yyyymmdd)) %then taqmsec.&type._&yyyymmdd;
%end;
%mend;
* using this macro;
data my_output;
set %taq_daily_dataset_list(type=ctm,begyyyymmdd=20100101,endyyyymmdd=20121231) open=defer;
run;
/* Download result to your computer */
proc download data=my_output;
run;
ENDRSUBMIT; /* Signals end of processing on remote server */
Any programming statements that appear between the RSUBMIT and ENDRSUBMIT commands are executed on the remote server. Notice that the macro is created and executed by the remote SAS session.
Remember to use the signoff command to disconnect from the server after you retrieve the data you need.
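That is, once the ENDRSUBMIT block and PROC DOWNLOAD have completed:
/* disconnect from the WRDS server */
signoff;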
I don't speak SAS so I can't comment on your code, but I don't recognize "taqmsec" as one of the main files. The Consolidated Trades data is held in files of the form taq.CT_YYYYMMDD and the Consolidated Quotes files are taq.CQ_YYYYMMDD. The first available date for these is 19930104.
Back when I had an account, I wrote some Python scripts to automate the process of downloading data in bulk from WRDS: https://github.com/jbrockmendel/pywrds
The scripts that try to auto-setup SSH keys are untested (please send me a note if you want to help me test/fix them), but the core is well-tested. Assuming you have an account and key-based authentication set up, you can run:
import pywrds
# Download the TAQ Consolidated Trades (TAQ_CT) file for 1993-06-12.
# y = [num_files, num_rows, paramiko_ssh, paramiko_sftp, time_elapsed]
y = pywrds.get_wrds('taq.ct', 1993, 6, 12)
# Loop over all available dates to download in bulk.
# The script is moderately smart about picking up
# unfinished loops where they left off.
# y = [num_files, time_elapsed]
y = pywrds.wrds_loop('taq.ct')
# Find out what the darn names of the available TAQ files are.
# y = [file_list, paramiko_ssh, paramiko_sftp]
y = pywrds.find_wrds('taq')
The files start at a few tens of MB apiece in 1993 and grow to ~1 GB apiece for taq.ct and >5GB for taq.cq. Standard WRDS accounts limit your storage space to 1 GB, so trying to query all of, say, taq.cq_20050401 will put a truncated file in your directory. pywrds.get_wrds breaks up these big queries and loops over smaller files, then recombines them after they have all downloaded.
Caution: wrds_loop also deletes these files from your directory on the server after downloading them. It also runs rm wrds_export*, since all of the SAS files it uploads begin with "wrds_export". Make sure you don't have anything else following the same pattern.
The same commands also work with Compustat (comp.fundq, comp.g_fundq, ...), CRSP (crsp.msf, crsp.dsf, ...), OptionMetrics (optionm.optionm_opprcd1996, optionm.opprcd1997,...), IBES, TFN, ...
# Also works with other WRDS datasets.
# The day, month, and year arguments are optional.
# Get the OptionMetrics pricing file for March 1993
y = pywrds.get_wrds('optionm.opprcd', 1993, 3)
# Get the Compustat Fundamentals Quarterly file for 1997
y = pywrds.get_wrds('comp.fundq', 1997)
# Get the CRSP Monthly Stock File for all available years
y = pywrds.get_wrds('crsp.msf')