Add a YTD column in SAS - sas

This is my imported table in SAS
[image: imported table]
I want to create a new column titled YTD that sums the months of the year. The new table should look like this
[image: desired table with YTD column]
It would be ideal if the code could accommodate new months going forward as well.
I do realize that this data set is not ideally structured, but this is what I have to work with.
Thanks

For the data structure shown in the image, you can create a view that sums all numeric variables in each row:
data want / view=want;
set have;
YTD = sum (of _numeric_);
run;
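A minimal sketch of the same idea on made-up data, since the real column names are only visible in the image. The names Product, Jan, Feb and Mar are assumptions; the sketch also assumes the month columns are the only numeric variables, because `_numeric_` picks up every numeric variable defined at that point.

```sas
/* Hypothetical sample data: the real column names are not shown in the question */
data have;
  input Product $ Jan Feb Mar;
  datalines;
A 10 20 30
B 5 . 15
;
run;

/* Rerunning this step after HAVE gains new month columns includes them in YTD */
data want;
  set have;
  YTD = sum(of _numeric_);  /* SUM ignores missing values */
run;

proc print data=want noobs;
run;
```

With this data YTD is 60 for product A and 20 for product B. If the table held a numeric column that is not a month (an ID, say), it would be added into YTD as well, so such a column would need to be excluded or stored as character.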

Related

Summation of rows based on Dropdown selection in Power BI

I have a dropdown in Power BI that contains different project names, such as Project One, Two, and Three. I have one formula that computes the forecast value:
Forecast = Chase * Target%
I have created one measure that calculates the forecast. The dataset contains weekly data for Chase and Target %. For example, for week 1 (Jan 01-Jan 08), Chase is 30 and Target % is 10, so the forecast for week 1 is 3 (30 * 10%).
When I select a single project from the dropdown list, e.g. "Project One", the forecast value is populated correctly.
The issue arises when I select multiple projects: the forecast then shows the maximum value instead of the sum of the values for all weeks of all selected projects.
Question: What exactly is causing the issue?
Now I understand your requirement from your comments. You can achieve this in two steps, as explained below.
Step-1: Create a custom column in your data source as below:
row_level_forecast = finetarget[chase]/100.00 * finetarget[target]
Step-2: Create the final measure as below:
forecast = sum(finetarget[row_level_forecast])
Now use the measure "forecast" in the report. Because the forecast is computed per row and then summed, this should give you the desired output.
ISSUE-2: From your comments
If I understand correctly, you are concerned about the values in the columns I marked red in the picture below.
If my understanding is correct, you want to fill the week-3 values with 80/70 for Project-1 and 100/90 for Project-2. If so, follow these steps.
Step-1: Go to edit mode by clicking the "Transform Data" option and select the table whose data you want to adjust.
Step-2: Sort your data first by project_name (ascending), then by week (ascending). The output will look as shown in the image above.
Step-3: Select the column "chase" in the table and click Fill>>Down.
Step-4: Repeat step 3 for the column "target".
The final output should be as below. Move back to the main report by clicking "Close and Apply". The data should now be as expected in your report.
When you display the forecast, put it in a grid and add the project column, the week column (e.g. Week 1) and the forecast measure. When you select your multiple projects the grid will show each of those along with the calculated measure. If this does not work, there is something wrong with your measure and you should add your measure calculation script to your question.
The measure should be simple, something like:
Forecast = SUM(YourTable[Chase]) * AVERAGE(YourTable[Target%])

Proc SQL running total

I am building a process in SAS EG and came to a sticking point when I needed a running total. This would be very easy to do in Excel but my table is 22M records long. I have VBA experience but not Proc SQL. Can someone show me how to do a running total of dollars by item? The data is sorted by Market/Segment/Item/Month.
Thanks
Jeff
[image: MyData]
Your hierarchy is Market / Segment / Item, and from the question one can perhaps presume that an Item is unique across all Markets and Segments.
A running total is easiest in a DATA step. You will want to use the first. automatic variables, which are created when the step has a BY statement.
data want;
set have;
by Market Segment Item Month; /* Month is included so the log reports an error if the incoming data is not in time order */
if first.Item then RunningDollars = 0;
RunningDollars + Dollars; /* the + syntax is a SUM statement: RunningDollars is automatically retained, so its value carries over to the next record */
run;
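To see the step in action, here is a small self-contained sketch. The sample rows and item codes are invented for illustration; only the variable names Market, Segment, Item, Month and Dollars come from the question.

```sas
/* Hypothetical sample data, already sorted by Market/Segment/Item/Month */
data have;
  input Market $ Segment $ Item $ Month Dollars;
  datalines;
East Retail A100 1 10
East Retail A100 2 15
East Retail A100 3 5
East Retail B200 1 7
East Retail B200 2 3
;
run;

data want;
  set have;
  by Market Segment Item Month;
  if first.Item then RunningDollars = 0;  /* reset at the start of each item */
  RunningDollars + Dollars;               /* SUM statement: value is retained */
run;

proc print data=want noobs;
run;
```

RunningDollars comes out as 10, 25, 30 for item A100 and 7, 10 for item B200. Note that first.Item is also set whenever Market or Segment changes, so the total restarts for every Market/Segment/Item combination even if the same item code appears in more than one segment.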

Percent split with where condition in SAS

I am new to SAS and data analytics in general, so sorry if my question sounds too dumb.
I have a dataset of brand medicines with three variables. Variable 1 contains the drug name, variable 2 contains whether that drug is Branded, Generic or Branded-Generic, and variable 3 contains the total sale of that drug.
What I want is the percent split of Branded, Generic and Branded-Generic drugs in the total drug sale. The final output should look like:
Branded : 35%
Generic : 25%
Branded-Generic : 40%
Any help with SAS code that would do this is greatly appreciated. Thank you.
So you want a % sale split! You can try using SQL (proc sql) to get your desired answer.
proc sql;
/* total sale per drug type */
create table totals as
select drug_type, sum(total_sale) as tot_sale
from have
group by drug_type;
/* a separate output table avoids the warning SAS issues when a CREATE TABLE reads from its own target */
create table want as
select *, tot_sale/sum(tot_sale) as percent_sale format=percent10.2
from totals;
quit;
I created a table 'totals' that holds the total sale for each drug type. From that table, I created 'want', which adds a column with the calculated sale percentage, formatted as a percent (for easy viewing).
Of course, there are other ways of doing it, like using proc summary or proc freq or even a data step. But as a beginner, I guess starting out with SQL would be a good decision.
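As one sketch of the PROC FREQ route mentioned above: a WEIGHT statement makes each row count as its sale amount, so the Percent column becomes the share of total sales. The sample rows below are invented, and the column names drug_name, drug_type and total_sale are assumed to match the SQL answer.

```sas
/* Hypothetical sample data; names follow the SQL example above */
data have;
  length drug_name $12 drug_type $16;
  input drug_name $ drug_type $ total_sale;
  datalines;
DrugA Branded 350
DrugB Generic 250
DrugC Branded-Generic 400
;
run;

/* WEIGHT makes each row contribute total_sale instead of 1 */
proc freq data=have;
  tables drug_type / nocum;  /* nocum drops the cumulative columns */
  weight total_sale;
run;
```

For this data the Percent column shows 35 for Branded, 40 for Branded-Generic and 25 for Generic.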

Getting average price across stores and across months

I am trying to use the proc tabulate procedure to arrive at the average price of some configurable items, across stores and across months. Below is the sample data set, which I need to process
Configuration|Store_Postcode|Retail Price|month
163|SE1 2BN|455|1
320|SW12 9HD|545|1
23|E2 0RY|515|1
The below code is displaying the month wise average price for each configuration.
proc tabulate data=cs2.pos_data_raw;
class configuration store_postcode month;
var retail_price;
table configuration,month*MEAN*retail_price;
run;
But can I get this grouped one more level - at the Store Post code level? I modified the code to read as shown below, but executing this is crashing the system!
proc tabulate data=cs2.pos_data_raw;
class configuration store_postcode month;
var retail_price;
table configuration,store_postcode*month*MEAN*retail_price;
run;
Please advise whether my approach is incorrect, or what I am doing wrong in proc tabulate that makes it crash the system.
I am not sure this exactly answers your question, since I am new to SAS, but when I switched store_postcode*month*MEAN*retail_price to month*store_postcode*MEAN*retail_price, it worked without crashing. My guess is that this is because your data contains only one value for month and several for postcode, so month is the most general level of categorization and postcode the more specific one.
On a side note, I also tried formatting the table another way, to segment the data by postal code:
proc tabulate data=pos_data_raw;
class configuration store_postcode month;
var retail_price;
table store_postcode*configuration, month*MEAN*retail_price;
run;
In the output, the table has postal code and configuration ID on the left, and month and retail price across the top.

Fill in missing values with mode in SAS

I think the logic to replace missingness is quite clear but when I dump it to SAS I find it too complicated to start with.
Given that no code was provided, I'll give you some rough directions to get you started, but leave it to you to work out the specifics.
First, let's create a month column for the data and then calculate the mode for each key in each month. Additionally, let's put this new data in its own dataset.
data temp;
set original_data;
month = month(date);
run;
proc univariate data=temp modes;
class key month; * compute the statistics separately for each key and month;
var value;
output out=mode_data mode=mode_value; * one row per key and month, with the mode in mode_value;
run;
However, this procedure calculates the mode in a very specific way that you may not want (it defaults to the lowest value in the case of a tie and produces no mode if nothing occurs at least twice). Documentation: http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_univariate_sect027.htm
If that doesn't work for you, I recommend using proc sql to get a count of each key, month, value combination and calculating your own mode from there.
proc sql;
create table mode_data as select distinct
key, month, value, count(*) as distinct_count
from temp
group by key, month, value;
quit;
From there you might want to create a table containing all months in the data.
proc sql;
create table all_months as select distinct month
from temp;
quit;
Don't forget to merge any missing months back into the mode data, and use the LAG function or a RETAIN statement to search previous months for "old modes".
Then simply merge your fully populated mode data back to the temp dataset we created above and set value to the mode wherever it is missing (i.e. value = .).
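The counting and merging steps could be sketched roughly as follows. This is only an illustration under assumptions: the sample data is invented, ties are broken by taking the smallest value, a value that occurs only once can still be the mode, and the sketch does not look back to earlier months when a key has no values at all in a month.

```sas
/* Hypothetical sample data with the columns assumed in this answer */
data temp;
  input key $ month value;
  datalines;
A 1 10
A 1 10
A 1 20
A 1 .
B 1 5
B 1 7
B 1 .
;
run;

proc sql;
  /* Count each non-missing value per key and month */
  create table mode_data as
  select key, month, value, count(*) as distinct_count
  from temp
  where value is not missing
  group by key, month, value;

  /* Keep the most frequent value(s); SAS remerges the group maximum */
  create table top_values as
  select key, month, value
  from mode_data
  group by key, month
  having distinct_count = max(distinct_count);

  /* Break ties by taking the smallest value */
  create table modes as
  select key, month, min(value) as mode_value
  from top_values
  group by key, month;

  /* Fill missing values with the mode for the same key and month */
  create table want as
  select t.key, t.month, coalesce(t.value, m.mode_value) as value
  from temp as t
  left join modes as m
    on t.key = m.key and t.month = m.month;
quit;
```

With this data the missing value for key A becomes 10, and the missing value for key B becomes 5 (5 and 7 tie, so the smaller one is used).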
Hope that helps get you started.