How to show sas data set using group by in report - sas

I'm working in the SAS EGRC 6.1 module and I have the data set below in the WORK library:
SAGIA_Detail
BranchCode BranchName RegionCode SagiaLicNo IssuingDate ExpiryDate
20 Abc Central 1 1/1/2000 1/1/2001
20 Abc Central 2 1/1/2000 1/1/2001
10 def East 3 1/1/2000 1/1/2001
BranchManager IsIssuance IssuanceFees IsRenewal RenewalFees Total
name Yes 100 No 0 100
name Yes 100 Yes 100 200
name Yes 200 Yes 100 300
I want to print this data set in my SAS Stored Process report, grouped by BranchName.
I wrote the code below, but it only prints the detail rows of my data set, without any total or grand total row in the HTML table.
In short, nothing after the end statement of the DO loop seems to execute. Please help me figure out what I did wrong in my code.
data _null_;
file _webout;
put '<html><body><table>';
do until(last.region);
set SAGIA_Detail nobs=nobs end=eof;
by BranchCode RegionCode BranchName;
if first.BranchCode then put
'<tr><th colspan="9"><span><b><u>BranchCode</u></b>: ' BranchCode '</span><span><b><u>BranchName</u></b>: ' BranchName '</span><span><b><u>RegionCode</u></b>: ' RegionCode '</span></th></tr>
<tr class=Head>
<th>Sagia LicenseNo</th>
<th>Issue Date</th>
<th>Expiry Date</th>
<th>Manager Name</th>
<th>Is Issuance</th>
<th>Issuance Fees</th>
<th>Is Renewal</th>
<th>Renewal Fees</th>
<th>Total</th>
</tr>';
put '<tr>
<td>';put SagiaLicNo; put '</td>
<td>';put IssuingDate date10.; put '</td>
<td>';put ExpiryDate date10.; put '</td>
<td>';put BranchManager; put '</td>
<td>';put IsIssuance; put '</td>
<td>';put IssuanceFees comma12.2; put '</td>
<td>';put IsRenewal; put '</td>
<td>';put RenewalFees comma12.2; put '</td>
<td>';put Total comma12.2; put '</td>
</tr>';
Total_IssuanceFees = sum(Total_IssuanceFees,IssuanceFees);
Total_RenewalFees = sum(Total_RenewalFees,RenewalFees);
Total_TotalFees = sum(Total_TotalFees,Total);
end;
put '<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>';
put '<tr class=Grand>
<td>Grand</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>' Total_IssuanceFees '</td>
<td></td>
<td>' Total_RenewalFees '</td>
<td>' Total_TotalFees '</td>
</tr>';
Issuancegrand+Total_IssuanceFees;
Renewalgrand+Total_RenewalFees;
Totalgrand+Total_TotalFees;
if eof then put
'<tr class=Grand>
<td colspan="6">GrandTotal</td>'
'<td>' Issuancegrand '</td>'
'<td>' Renewalgrand '</td>'
'<td>' Totalgrand '</td></tr>'
'</table>
</body>
</html>';
run;

First, you are using a BY statement, so make sure your data is sorted by the same variables.
proc sort data=SAGIA_Detail;
by BranchCode RegionCode BranchName;
run;
Second, your sample data has the variable BranchRegion but your code uses RegionCode. Are these the same? Make sure your code is using a variable that actually exists.
Further, your data step references last.region. There is no region variable.
Fixing those should get you closer to what you are looking for.
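To make the sort-then-group logic concrete, here is a minimal sketch in Python rather than SAS. It is only an illustration of how per-group subtotals and a grand total relate once the rows are sorted on the grouping key. The three rows are the fee columns from the sample data in the question; the names rows, FEE_COLUMNS, subtotals and grand are made up for this example.

```python
from itertools import groupby
from operator import itemgetter

# Fee columns from the question's sample data (other columns omitted).
rows = [
    {"BranchCode": 20, "IssuanceFees": 100, "RenewalFees": 0, "Total": 100},
    {"BranchCode": 20, "IssuanceFees": 100, "RenewalFees": 100, "Total": 200},
    {"BranchCode": 10, "IssuanceFees": 200, "RenewalFees": 100, "Total": 300},
]
FEE_COLUMNS = ("IssuanceFees", "RenewalFees", "Total")

# Grouping only works on sorted input, which is the role PROC SORT plays.
rows.sort(key=itemgetter("BranchCode"))

grand = dict.fromkeys(FEE_COLUMNS, 0)
subtotals = {}
for branch, group in groupby(rows, key=itemgetter("BranchCode")):
    # Start of a group: begin a fresh subtotal.
    subtotal = dict.fromkeys(FEE_COLUMNS, 0)
    for row in group:
        for col in FEE_COLUMNS:
            subtotal[col] += row[col]
    # End of a group: record the subtotal and roll it into the grand total.
    subtotals[branch] = subtotal
    for col in FEE_COLUMNS:
        grand[col] += subtotal[col]
    print("Branch", branch, subtotal)

# After the last group: the grand total row.
print("Grand total", grand)
```

The grouping key has to be a variable that actually exists in the data, which is the same point as the last.region problem above.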

Related

SAS ODS RTF & proc report

I'm trying to figure out why the formatting of my reports suddenly changed.
I had been exporting a report with 8 columns, all of which fit on a single page in a Word document. This month, it decided to put a page break in the middle (so the first 4 columns are on the first page and the second 4 columns are on the second page). The code didn't change at all.
So, thoughts:
Data in the columns got bigger, so the width no longer fits on one page. I thought the ODS options "keepn" or "trkeep" would solve this, but neither made any difference.
Some other SAS setting made the column width defaults change. So I put "width=10" on the define statements within the proc report. Again, nothing changed.
My Code:
ODS RTF FILE="&REPORT_LOC./&outfile..rtf" keepn trkeep;
proc report data=FinalRpt2 nowindows headline headskip spacing=2 missing split='*';
column storeNum dept sales;
define storeNum / order width=10 'Store Number';
define dept / order width=10 'Department';
define sales / display width=10 'Sales';
run;

BeautifulSoup How do I extract data from specific columns from HTML table. My code is extracting all the columns

I have an HTML table with a row and some columns. I would like to extract the data from the column which has the text "Total" and the data from the column which has the value "93".
Just these 2 columns I would like to extract the data. My code is extracting the data from all of the columns.
E.g. My output is:
Total
93
93
0
0
My desired output would be:
Total 93
My code is:
def extract_total_from_report_htmltestrunner():
filename = (
r"C:\test_runners 2 edit project\selenium_regression_test\TestReport\ClearCore_Automated_GUI_Regression_TestReport.html")
html_report_part = open(filename, 'r')
soup = BeautifulSoup(html_report_part, "html.parser")
tr_total_row = soup.find('tr', {'id': 'total_row'})
tr_total_row.find(text=True, recursive=False)
print tr_total_row.text
return tr_total_row.text
The HTML snippet is:
<table id='result_table'>
<tr id='total_row'>
<td>Total</td>
<td>93</td>
<td>93</td>
<td>0</td>
<td>0</td>
<td> </td>
</tr>
</table>
How do I extract "Total" and "93" and print them on the same line?
Thanks, Riaz
You can use find_all() and slice the results:
" ".join(td.get_text(strip=True) for td in tr_total_row.find_all("td")[:2])
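If BeautifulSoup is not available, the same "collect the cells of the row, then keep the first two" idea can be sketched with only the standard library's html.parser. This is a substitute technique, not the find_all() call above, and the class name RowCellCollector is made up for this example. The HTML is the snippet from the question.

```python
from html.parser import HTMLParser


class RowCellCollector(HTMLParser):
    """Collects the text of every <td> inside the <tr> with the given id."""

    def __init__(self, row_id):
        super().__init__()
        self.row_id = row_id
        self.in_row = False
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            # Only track cells while inside the row we care about.
            self.in_row = dict(attrs).get("id") == self.row_id
        elif tag == "td" and self.in_row:
            self.in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False
        elif tag == "tr":
            self.in_row = False

    def handle_data(self, data):
        if self.in_cell:
            self.cells[-1] += data


html = """<table id='result_table'>
<tr id='total_row'>
<td>Total</td>
<td>93</td>
<td>93</td>
<td>0</td>
<td>0</td>
<td> </td>
</tr>
</table>"""

collector = RowCellCollector("total_row")
collector.feed(html)

# Slice to the first two cells and join them on one line.
first_two = " ".join(cell.strip() for cell in collector.cells[:2])
print(first_two)  # Total 93
```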

using cfswitch cfcase to display every 3 months

I am working on a tax project. Taxes are broken down into quarters. The months the taxes are run are March, June, September, and December. Once run, my website displays when the taxes will run again. My problem is that on my results page, when the next run date is December, instead of displaying 12-2012 I get something that looks like 0-2012.
Here is my code:
<td style="white-space: nowrap;"> #stec_mysql_search_results.cover_date# </td>
<td style="white-space: nowrap;"> <cfif "" neq stec_mysql_search_results.next_run>0<cfset temp_next_run = stec_mysql_search_results.next_run MOD 4><cfswitch expression="#temp_next_run#">
<cfcase value="1">3</cfcase>
<cfcase value="2">6</cfcase>
<cfcase value="3">9</cfcase>
<cfcase value="4">12</cfcase>
</cfswitch>-<cfif 4 lt stec_mysql_search_results.next_run>#year(now())+1#<cfelse>#year(now())#</cfif></cfif> </td>
Here is the output when you view source:
<td style="white-space: nowrap;"> 07-16-2012 </td>
<td style="white-space: nowrap;"> 0-2012 </td>
The key to the problem is your code is expecting 12 mod 4 to give 4, when it gives 0.
The code you've provided has been formatted with hardly any line-breaks in, which is a stupid way of writing code because it makes it very difficult to maintain (in terms of readability, modification, and even simple revision comparisons), especially when later developers have to come along and understand what's going on.
Make sure you use newlines - particularly if that means fixing code written by others. If the output of whitespace is an issue then the ideal solution is generally to put the logic in a function (and use output=false), though you can also use <cfsilent>..</cfsilent> blocks, appropriately placed comments <!--- --->, and other means.
Here is the relevant part of your code translated to something actually readable:
<cfif "" neq stec_mysql_search_results.next_run>
0
<cfset temp_next_run = stec_mysql_search_results.next_run MOD 4>
<cfswitch expression="#temp_next_run#">
<cfcase value="1">3</cfcase>
<cfcase value="2">6</cfcase>
<cfcase value="3">9</cfcase>
<cfcase value="4">12</cfcase>
</cfswitch>
-
<cfif 4 lt stec_mysql_search_results.next_run>
#year(now())+1#
<cfelse>
#year(now())#
</cfif>
</cfif>
The 0 you are seeing in your results is the hard-coded one just inside the cfif.
Because the switch doesn't have a case for 0 it is not outputting anything.
To make the existing code work, just change the cfcase for 4 to 0.
However, since this is dealing with quarters, I don't think you're calculating what you mean to calculate.
Here is what simply changing the cfcase from 4 to 0 would result in, if next_run is a month number from 1 to 12:
January = March
February = June
March = September
April = December
May = March
June = June
July = September
August = December
September = March
October = June
November = September
December = December
When what you probably want is this:
January = March
February = March
March = March
April = June
May = June
June = June
July = September
August = September
September = September
October = December
November = December
December = December
Which can be done really simply with 3*ceiling(next_run/3).
If this assumption is correct, there's a significantly better way to write your code:
<td>#calculateNextRunQuarter(stec_mysql_search_results.next_run)#</td>
<cffunction name="calculateNextRunQuarter" returntype="String" output=false>
<cfargument name="NextRunMonth" type="Numeric" required />
<cfset var Quarter = 3*ceiling(Arguments.NextRunMonth/3) />
<cfset var TheYear = Year(Now()) />
<cfif Arguments.NextRunMonth GTE 4 >
<cfset TheYear = TheYear + 1 />
</cfif>
<cfreturn Right('0'&Quarter,2) & '-' & TheYear />
</cffunction>
And because the logic is all inside a function with output=false, there's no stray whitespace and the code is still perfectly readable.
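The arithmetic in this answer is easy to check outside ColdFusion. Below is a small Python sketch, assuming (as the answer does) that next_run is a month number from 1 to 12. The function names are made up for this example, and the year adjustment from the cffunction is deliberately left out so the sketch covers only the month calculation.

```python
import math

# The <cfswitch> mapping from the question, with the 4 case changed to 0.
MONTH_FOR_REMAINDER = {1: 3, 2: 6, 3: 9, 0: 12}


def month_via_mod4(next_run):
    """What the switch on next_run MOD 4 produces."""
    return MONTH_FOR_REMAINDER[next_run % 4]


def quarter_end_month(month):
    """Round a month number up to the end of its quarter: 3*ceiling(month/3)."""
    return 3 * math.ceil(month / 3)


def format_run(month, year):
    """Zero-padded 'MM-YYYY', like Right('0' & Quarter, 2) & '-' & TheYear."""
    return f"{quarter_end_month(month):02d}-{year}"


print(12 % 4)  # 0, which is why the original switch had no matching case
for month in range(1, 13):
    print(month, month_via_mod4(month), quarter_end_month(month))
print(format_run(12, 2012))  # 12-2012
```

Running the loop shows the MOD 4 version cycling through 3, 6, 9, 12 every four months, while the ceiling version stays on the same quarter-end month for three months at a time.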
cfcase will accept a list, so maybe you're overcomplicating it. Why not do this:
<cfswitch expression="#stec_mysql_search_results.next_run#">
<cfcase value="1,2,3">3</cfcase>
<cfcase value="4,5,6">6</cfcase>
<cfcase value="7,8,9">9</cfcase>
<cfcase value="10,11,12">12</cfcase>
</cfswitch>

Wrestling with PROC REPORT and summary lines

I am having trouble getting proc report to do quite what I want.
I have a table with state, item, counts, percentage by state and percentage of total. There are summary lines giving the total by state and a grand total. My problem is that those summary lines summarize the state totals at the grand total level. like so:
CODE:
proc report data=dataset nowd ;
columns state item count pct_state percent;
define state /order 'State';
define item / 'Status';
define count / '#';
define pct_state / '% of State';
define percent / '% of Total';
break after state/ol summarize;
compute after state;
item=catt(state,' Total');
state = '';
line #1 ' ';
endcomp;
rbreak after /ol summarize;
compute after;
involved = 'Grand Total';
endcomp;
run;
Makes a table like this:
State Item # %state %total
AL A 2 40.0% 20.0%
B 3 60.0% 30.0%
AL Total 5 100.0% 50.0%
MN A 1 20.0% 10.0%
B 1 20.0% 10.0%
C 3 60.0% 30.0%
MN Total 5 100.0% 50.0%
Grand Total 10 200.0% 100.0%
As you can see, it reports the state % total as 200% which is a nonsensical number. I would prefer to have it not summarize the state value at all. I know that the sas website warns about using dates on tables with summary lines since SAS interprets them as numerical variables and thus summarizes them...but it doesn't provide a good solution. I really don't understand why the BREAK and RBREAK statements don't have a "VAR" option that lets you specify...but now I need a workaround.
What I have come up with is to make a new variable and store the percentage as text so that it can't be computed in the summary but this is a really backwards way to do it.
data dataset; set dataset;
state_txt = trim(left(put(pct_state,percent10.1)));
run;
proc report data=dataset nowd ;
columns state item count state_txt percent;
define state /order 'State';
define item / 'Status';
define count / '#';
define state_txt / right '% of State';
define percent / '% of Total';
break after state/ol summarize;
compute after state;
item=catt(state,' Total');
state = '';
line #1 ' ';
endcomp;
rbreak after /ol summarize;
compute after;
involved = 'Grand Total';
endcomp;
run;
This eliminates all of the summaries (since it is a character variable), but it seems like a terrible way of doing things when I should be able to say something like rbreak after /summarize var=count percent; and be done with it. Is there any better way to do it? Also, I wouldn't mind if it summarized on the per-state level to 100%. It's not a priority though, and is far less important than getting it to NOT say 200% on the bottom (or, in the case of a full USA table, 5000%).
Sample data:
data dataset;
length state item $50;
infile datalines delimiter=',';
input state item $ count percent pct_state;
datalines;
AL,A,8,0.0047,1.0000
DC,A,1,0.0006,0.5000
DC,B,1,0.0006,0.5000
FL,A,18,0.0107,0.7500
FL,B,2,0.0012,0.0833
FL,C,4,0.0024,0.1667
LA,A,434,0.2576,0.8314
LA,B,69,0.0409,0.1322
LA,C,19,0.0113,0.0364
MI,A,1,0.0006,1.0000
MS,A,4,0.0024,0.8000
MS,B,1,0.0006,0.2000
OK,A,2,0.0012,1.0000
PA,A,1,0.0006,1.0000
TX,A,943,0.5596,0.8435
TX,B,132,0.0783,0.1181
TX,C,43,0.0255,0.0385
VA,A,1,0.0006,1.0000
WI,B,1,0.0006,1.0000
;
I think using some if logic in your COMPUTE AFTER will do the trick.
Try this (I changed the data slightly, let me know if this doesn't represent your data):
(Left in the out= statement, which can be helpful)
data dataset;
length state item $50;
infile datalines delimiter=',';
input state item $ count percent pct_state;
format percent pct_state percent10.1;
datalines;
AL,A,8,0.8,1.0000
DC,A,1,0.1,0.5000
DC,B,1,0.1,0.5000
;
proc report data=dataset nowd out=work.report;
columns state item count pct_state percent;
define state /order 'State';
define item / 'Status';
define count / '#';
define pct_state / '% of State';
define percent / '% of Total';
break after state/ol summarize;
compute after state;
item=catt(state,' Total');
state = '';
line #1 ' ';
endcomp;
rbreak after /ol summarize;
compute after;
State = 'Grand Total';
if pct_state.sum>1 then pct_state.sum=1;
endcomp;
run;
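To see where the 200% comes from and what the cap in COMPUTE AFTER does, here is a rough Python sketch of the arithmetic only; it does not model PROC REPORT. The counts are the AL/MN rows from the example table in the question, and all names are made up for this example.

```python
from collections import defaultdict

# (state, item, count) rows from the AL/MN example table in the question.
rows = [
    ("AL", "A", 2),
    ("AL", "B", 3),
    ("MN", "A", 1),
    ("MN", "B", 1),
    ("MN", "C", 3),
]

state_totals = defaultdict(int)
for state, _item, count in rows:
    state_totals[state] += count
grand_total = sum(state_totals.values())

# Per-row percentages, as they would already exist in the input data set.
pct_state = [count / state_totals[state] for state, _item, count in rows]
pct_total = [count / grand_total for _state, _item, count in rows]

# A plain sum over every row adds up to 100% once per state.
summed_pct_state = sum(pct_state)
# The workaround above: never let the grand-total line exceed 100%.
capped_pct_state = min(summed_pct_state, 1.0)

print(f"summed % of state: {summed_pct_state:.1%}")
print(f"capped % of state: {capped_pct_state:.1%}")
print(f"summed % of total: {sum(pct_total):.1%}")
```

With two states the uncapped sum comes out at 200%, and it grows by 100% for every extra state, which matches the 5000% the question expects for a full USA table.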

Combine several variables of different data types into one string

I'd like to combine 6 variables that are of different types in my view...I thought the concatenation '+' will work but I get an error when I use it. I'd like the end result to be like this:
var1 var2 var3 var4 (var5 var6)
How do I go about this?
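I'm assuming the problem you had with using + is that not all of the variables are strings. Use the following; %s converts any value with str(), and you can replace it with a more specific conversion type (such as %d or %.2f) from the printf-style formatting documentation where you need one.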
output_string = "%s %s %s %s (%s %s)" % (var1, var2, var3, var4, var5, var6)
Another option is to pass them to your template and output them there:
{{ var1 }} {{ var2 }} {{ var3 }} {{ var4 }} ({{ var5 }} {{ var6 }})
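Here is a short, self-contained sketch of why + fails and how the formatting version gets around it. The values assigned to var1 through var6 are invented for illustration; only the format string comes from the answer.

```python
# Made-up values of different types, standing in for the six variables.
var1, var2, var3 = "Order", 42, 3.5
var4, var5, var6 = True, None, ["a", "b"]

# Concatenating a str and an int with + raises TypeError.
error = None
try:
    var1 + var2
except TypeError as exc:
    error = str(exc)
print("Concatenation failed:", error)

# %s calls str() on each value, so mixed types are fine.
percent_style = "%s %s %s %s (%s %s)" % (var1, var2, var3, var4, var5, var6)

# str.format() gives the same result with {} placeholders.
format_style = "{} {} {} {} ({} {})".format(var1, var2, var3, var4, var5, var6)

print(percent_style)  # Order 42 3.5 True (None ['a', 'b'])
```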