Consider the following Teradata View named 'VIEW' which consists of Transactional data.
ATTR1 ATTR2 DATE1 DATE2 WEEK1 WEEK2 AMOUNT
A B 1/1/2019 1/8/2019 201901 201902 10
A B 12/26/2018 1/8/2019 201852 201902 20
A B 1/1/2019 1/15/2019 201901 201903 30
A B 1/8/2019 1/15/2019 201902 201903 30
DATE1 is a Posting Date and DATE2 is the clearing date of the transaction. WEEK1 and WEEK2 are the fiscal weeks of DATE1 and DATE2 respectively. ATTR are random attributes of the transaction. I need to report the transaction amounts by the 'week of' for the attributes.
For example for Week 201901 we would like to see the transactions amounts of posting dates of and before Week 201901 and the clearing dates after 201901. See code below.
select ATTR1,
ATTR2,
SUM(CASE WHEN WEEK2 > 201852 AND WEEK1 <= 201852 THEN AMOUNT END) AS AMT_201852,
SUM(CASE WHEN WEEK2 > 201901 AND WEEK1 <= 201901 THEN AMOUNT END) AS AMT_201901,
SUM(CASE WHEN WEEK2 > 201902 AND WEEK1 <= 201902 THEN AMOUNT END) AS AMT_201902,
FROM VIEW
GROUP BY 1,2
The result:
ATTR1 ATTR2 AMT_201852 AMT_201901 AMT_201902
A B 20 60 60
As the code above suggests, we are having to manually create columns for each week which we would like to avoid. Is there a way to dynamically create these columns as the weeks pass by? Or is there a better way to represent this?
In a report, using WEEK1 as a filter will filter out the earlier weeks (In case of WEEK1 as 201901, 201852 will be filtered out and lose the respective amounts). We eventually put this SQL into a PowerBI dashboard.
Thanks!
You'll have to mess around with this a bit, but it should get you going in the right direction.
Teradata has an extremely useful date view built in: sys_calendar.calendar
This, for example, will get you the past three weeks in the format you have them in your sample data (YYYYWW):
select
distinct week_of_year,
year_of_calendar,
(year_of_calendar * 100) + week_of_year as year_week
from
sys_calendar.calendar
where
calendar_date between (current_date - interval '21' day) and current_date
order by year_week
You should be able to replace current_date with a parameter to make this dynamic, and join this to your table.
Related
I was wondering if this is possible in Power BI, I am extremely new to this and I am trying to relate how a sql query can translate in to a power bi report.
SELECT
expiresDate,
Name,
Addr,
ValidFrom,
ValidTo,
ChildName,
ChildValidFrom,
ChildValidTo,
RecValidFrom,
RecValidTo
FROM Table
WHERE expiresDate Between <date1> and <date2>
AND <Date3> BETWEEN ValidFrom AND ValidTo
AND <Date3> BETWEEN ValidFrom AND ValidTo
AND <Date3> BETWEEN ValidFrom AND ValidTo
A brief explanation. The report is for 3 months in advance. So in August the report is for September <date1 = 01/09/2021) and October (date2 = 31/10/2021) data. However the data can change on a daily basis. So this depends on Date3 which could be any day in August.
I have created a table that is a calendar and has the additional columns that calculate the start and end dates from a particular date. I just can't work out how to relate this to the dataset which is the query without the WHERE. I would then want the filters to be able to determine the result. Ultimately as I have it at present a single date that will then get the dates from the start and end dates as described earlier. Or display by range using the latest iteration of the record to display.
For example, First part of table
expiresDate
AccNo
Name
Addr
ValidFrom
ValidTo
ChildName
2021-10-01
1
Robert
1 Here
2019-01-01
2021-08-16
Cheese
2021-10-01
1
Robert
1 Here
2019-01-01
2021-08-16
Rhubarb
2021-10-01
1
Bob
1 Here
2021-08-17
2020-08-23
Rhubarb
Second half of table
ChildValidFrom
ChildValidTo
RecValidFrom
RecValidTo
2019-01-01
2021-08-10
2019-19-01
2020-12-31
2021-08-11
2021-08-23
2021-01-01
2021-08-15
2021-08-11
2021-08-23
2021-08-16
2020-08-23
The table is a view which has squashed the data to unique records and when the changes occurred. The dataset is considerably lower, a record count from 10m to 54k.
The requirement is that all To - From dates are within the date specified. Either being a date in the calendar that is entered as a filter... or today.
The report would bring out all records that have an expiryDate greater than 1 calendar month of the date, and less than 3 calendar months. I am just using August dates for the example so this would be from the 01/09/2021 - 31/10/2021.
If I use date 2021-08-01.
In my example there are 3 results for AccNo 1, but Only 1 should be displayed.
If I use the date 2021-08-01 the first row would be displayed.
If I use the date 2021-08-12 the second row should displayed.
If I use the date 2021-08-23 the third row should displayed.
Because the date used should fall between the date range of all 3 criteria
ValidFrom - ChildValidTo
ChildValidFrom - ChildValidTo
RecValidFrom - RecValidTo
Any help would be greatly appreciated. This is extremely frustrating, but I can understand that if this is possibly that this would make a nice visual for the users to check through their data based on entering a date.
Many thanks
How do I go about calculating number of workdays in a MONTH based on a date in another column?
Example:
Column 1 - 2020-06-30
Column 2 (Calculated) - 22 (i.e number of workdays in the month of June Mon to Friday)
Does BQ have a WORKDAY function?
You can use below approach
create temp function workdays(input date) as ((
select count(*)
from unnest(generate_date_array(date_trunc(input, month), last_day(input, month ))) day
where not extract(dayofweek from day) in (1, 7)
));
select column1,
workdays(column1) as column2
from your_table
if applied to sample data in your question - output is
I am struggling to join two table without creating duplicate rows using proc sql ( not sure if any other method is more efficient).
Inner join is on: datepart(table1.date)=datepart(table2.date) AND tag=tag AND ID=ID
I think the problem is date and different names in table 1. By just looking that the table its clear that table1's row 1 should be joined with table 2's row 1 because the transaction started at 00:04 in table one and finished at 00:06 in table 2. I issue I am having is I cant join on dates with the timestamp so I am removing timestamps and because of that its creating duplicates.
Table1:
id tag date amount name_x
1 23 01JUL2018:00:04 12 smith ltd
1 23 01JUL2018:00:09 12 anna smith
table 2:
id tag ref amount date
1 23 19 12 01JUL2018:00:06:00
1 23 20 12 01JUL2018:00:10:00
Desired output:
id tag date amount name_x ref
1 23 01JUL2018 12 smith ltd 19
1 23 01JUL2018 12 anna smith 20
Appreciate your help.
Thanks!
You need to set a boundary for that datetime join. You are correct in why you are getting duplicates. I would guess the lower bound is the previous datetime, if it exists and the upper bound is this record's datetime.
As an aside, this is poor database design on someone's part...
Let's first sort table2 by id, tag, and date
proc sort data=table2 out=temp;
by id tag date;
run;
Now write a data step to add the previous date for unique id/tag combinations.
data temp;
set temp;
format low_date datetime20.
by id tag;
retain p_date;
if first.tag then
p_date = 0;
low_date = p_date;
p_date = date;
run;
Now update your join to use the date range.
proc sql noprint;
create table want as
select a.id, a.tag, a.date, a.amount, a.name_x, b.ref
from table1 as a
inner join
temp as b
on a.id = b.id
and a.tag = b.tag
and b.low_date < a.date <= b.date;
quit;
If my understanding is correct, you want to merge by ID, tag and the closest two date, it means that 01JUL2018:00:04 in table1 is the closest with 01JUL2018:00:06:00 in talbe2, and 01JUL2018:00:09 is with 01JUL2018:00:10:00, you could try this:
data table1;
input id tag date:datetime21. amount name_x $15.;
format date datetime21.;
cards;
1 23 01JUL2018:00:04 12 smith ltd
1 23 01JUL2018:00:09 12 anna smith
;
data table2;
input id tag ref amount date: datetime21.;
format date datetime21.;
cards;
1 23 19 12 01JUL2018:00:06:00
1 23 20 12 01JUL2018:00:10:00
;
proc sql;
select a.*,b.ref from table1 a inner join table2 b
on a.id=b.id and a.tag=b.tag
group by a.id,a.tag,a.date
having abs(a.date-b.date)=min(abs(a.date-b.date));
quit;
The date in the table is not one set,
Days in the days column and months in the month column and years in the year column
I have concatenated the columns and then put these concatenation in where clause and put the parameter I have made but I got no result
I assume you are querying a date dimension table, and you want to extract the record that matches a certain date.
Solution:
I created a dates table to match with,
data dates;
input key day month year ;
datalines;
1 19 2 2018
2 20 2 2018
3 21 2 2018
4 22 2 2018
;;;
run;
Output:
In the where clause I parse the date '20feb2018'd using day, month & year functions: in SAS you have to quote the dates in [''d]
proc sql;
select * from dates
/*if you want to match todays' date: replace '20feb2018'd with today()*/
where day('20feb2018'd)=day and month('20feb2018'd)=month and year('20feb2018'd)=year;
quit;
Output:
if you compare date from day month and year, then use mdy function in where clause as shown below. it is not totally clear what you are looking for.
proc sql;
select * from dates
where mdy(month,day, year) between '19feb2018'd and '21feb2018'd ;
For the data set below(actual one is several thousand row long) I would like SAS to aggregate the income daily (many income lines everyday per machine), weekly, monthly (start of week is Monday, Start of month is 01 in any given year) by the machine. Is there a straight forward code for this? Any help is appreciated.
MachineNo Date income
1 01Jan2012 1500
1 02Jan2012 2000
1 27Aug2012 300
2 02Jan2012 1200
2 15Jun2012 50
3 03Mar2012 1000
4 08Apr2012 500
proc expand and proc timeseries are excellent tools for accumulation and aggregation to different frequencies of series. You can combine both with by-group processing to convert to any time period that you need.
Step 1: Sort by MachineNo and Date
proc sort data=want;
by MachineNo Date;
run;
Step 2: Find the min/max end dates of your series for date alignment
The format=date9. statement is important. For whatever reason, some SAS/ETS and HPF procedures require date literals for certain arguments.
proc sql noprint;
select min(date) format=date9.,
max(date) format=date9.
into :min_date,
:max_date
from have;
quit;
Step 3: Align each MachineNo by start/end date, and accumulate days per MachineNo
The below code will get you aligned daily accumulation, remove duplicate days per machine, and set Income on any missing days to 0. This step will also guarantee that your series has equal time intervals per by-group, allowing you to run hierarchical time-series analyses without violating the equal-spaced interval assumption.
proc timeseries data=have
out=want_day;
by MachineNo;
id date interval=day
align=both
start="&min_date"d
end="&max_date"d;
var income / accumulate=total setmiss=0;
run;
Step 4: Aggregate aligned Daily to Weekly shifted by 1 day, Monthly
SAS time intervals are able to be both multiplied and shifted. Since the standard weekday starts on a Sunday, we want to shift by 1 day to have it start on a Monday.
Standard Week
2 3 4 5 6 7 1
Mon Tue Wed Thu Fri Sat Sun
Shifted
1 2 3 4 5 6 7
Mon Tue Wed Thu Fri Sat Sun
Intervals follow the format:
TimeInterval<Multiplier>.<Shift>
The standard shift interval is 1. For all intents and purposes, consider 1 as 0: 1 means it's unshifted. 2 means it's shifted by 1 period. Thus, for a week to start on a Monday, we want to use the interval Week.2.
proc expand data=want_day
out=want_week
from=day
to=week.2;
id date;
convert income / method=aggregate observed=total;
run;
Step 5: Convert Week to Month
proc expand data=want_week
out=want_month
from=week.2
to=month;
id date;
convert income / method=aggregate observed=total;
run;
In case you don't have a license for SAS/ETS here's another way.
For the monthly data you can format the date in a proc means output.
I think WeekW. starts on Monday but it may not be in a format you want, so you'll need to create a new variable for week first if you wanted to use this method.
proc means data=have nway noprint;
class machineno date;
format date monyy7.;
var income;
output out=want sum(income)=income;
run;