In our production installation, WSO2 APIM 2.0 has been running for the past 2 months, and we noticed that the WSO2METRICS_DB H2 database keeps growing. It is currently 1.2 GiB.
Upon checking the content, we noticed that no data has been cleaned up since the day it was deployed.
sql> SELECT COUNT(*) FROM METRIC_GAUGE ;
COUNT(*)
2756204
(1 row, 1 ms)
sql> SELECT MIN(TIMESTAMP) FROM METRIC_GAUGE ;
MIN(TIMESTAMP)
1476859611002 (GMT: Wed, 19 Oct 2016 06:46:51.002 GMT)
(1 row, 3 ms)
sql> SELECT COUNT(*) FROM METRIC_METER ;
COUNT(*)
280292
(1 row, 0 ms)
sql> SELECT MIN(TIMESTAMP) FROM METRIC_METER ;
MIN(TIMESTAMP)
1476860091000 (GMT: Wed, 19 Oct 2016 06:54:51 GMT)
(1 row, 2 ms)
sql> SELECT COUNT(*) FROM METRIC_TIMER ;
COUNT(*)
1118983
(1 row, 1 ms)
sql> SELECT MIN(TIMESTAMP) FROM METRIC_TIMER ;
MIN(TIMESTAMP)
1476859611002 (GMT: Wed, 19 Oct 2016 06:46:51.002 GMT)
(1 row, 2 ms)
sql> SELECT COUNT(*) FROM METRIC_COUNTER ;
COUNT(*)
0
(1 row, 0 ms)
sql> SELECT COUNT(*) FROM METRIC_HISTOGRAM ;
COUNT(*)
0
(1 row, 0 ms)
Can we safely delete old data by timestamp (say, older than a month)?
Is there a clean-up task that is supposed to do this automatically in WSO2?
Thanks.
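If old rows are to be removed by timestamp, the cutoff has to be expressed in the same unit as the TIMESTAMP column, which the query output above shows as epoch milliseconds. Below is a minimal Python sketch of computing such a cutoff. The function names and the example "now" value are illustrative assumptions, and the sketch does not touch the database or say anything about whether deletion is safe in WSO2.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_utc(ms: int) -> datetime:
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


def cutoff_ms(now: datetime, keep_days: int = 30) -> int:
    """Return the epoch-millisecond value lying keep_days before `now`."""
    return (now - timedelta(days=keep_days) - EPOCH) // timedelta(milliseconds=1)


# MIN(TIMESTAMP) reported for METRIC_GAUGE in the output above.
oldest = ms_to_utc(1476859611002)
print(oldest.isoformat())

# Hypothetical "current time", chosen only for illustration.
now = datetime(2016, 12, 19, tzinfo=timezone.utc)
print(cutoff_ms(now))
```

Rows whose TIMESTAMP is below the printed cutoff would be the ones older than the retention window.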
I'm having some trouble with the logic behind a report and was hoping to get some help on how to capture more rows in a table with a date slicer.
I'll start by laying out the structure of my data and what I'm hoping to accomplish
I have a date column that I generated in SQL Server. The whole report has to use DirectQuery, so any solution has to work within that constraint.
The date column just has dates like 1/1/2022, 1/2/2022, etc.
DATE
1 / 1 / 2022
1 / 2 / 2022
1 / 3 / 2022
1 / 4 / 2022
The data table is being filtered by a date slicer that uses the dates in the date column, as well as a DAX formula that gets the selected values from the slicer.
Here is an example of the data table:
Begin Date
End Date
Data
2 / 30 / 2022
4 / 4 / 2022
0
3 / 6 / 2022
4 / 26 / 2022
0
4 / 7 / 2022
4 / 26 / 2022
0
4 / 30 / 2022
5 / 15 / 2022
0
In this instance, the table I'm filtering has two date columns that I need to filter by, hence the date table. For the data above to appear, I'm setting the slicer to 4-1-2022 through 4-30-2022. Ideally, if at least one date between Begin Date and End Date falls within the range selected in the slicer, the row will appear.
Here is the code for what I have right now:
var range_start = FIRSTDATE('DATE'[Dates]) -- assumed: this definition is missing from the post; counterpart of range_end
var range_end = LASTDATE('DATE'[Dates])
var range = DATESBETWEEN('DATE'[Dates], range_start, range_end)
return
    if(
        CONTAINSROW(range, SELECTEDVALUE('DATA'[Begin Date]))
            || CONTAINSROW(range, SELECTEDVALUE('DATA'[End Date])),
        1, 0)
This DAX is used as a filter on the data table: if it returns 1 for a row, the row is displayed.
A problem occurs when I filter by a smaller date range, and I think it's because my code is not picking up as many values as I want it to.
For instance, if I filter the Date column down to 4/1/2022 to 4/2/2022, I return these possible rows:
Begin Date
End Date
Data
4 / 1 / 2022
5 / 13 / 2022
0
4 / 1 / 2022
4 / 26 / 2022
0
4 / 1 / 2022
4 / 26 / 2022
0
4 / 1 / 2022
5 / 15 / 2022
0
But if I expand the filter by a day (4/1/2022 to 4/3/2022) I get a new row:
Begin Date
End Date
Data
3 / 20 / 2022
4 / 3 / 2022
< New row
4 / 1 / 2022
5 / 13 / 2022
0
4 / 1 / 2022
4 / 26 / 2022
0
4 / 1 / 2022
4 / 26 / 2022
0
4 / 1 / 2022
5 / 15 / 2022
0
Ideally, the filter range of 4/1/2022 to 4/2/2022 would have picked up this row already, because it contains 4/2/2022 between its begin and end date.
I think this happens because my code treats each selected date as a single value. Ideally, it would look at all dates between the Begin Date and End Date columns and check whether any of them falls within the specified date range. However, my attempts at solving this with DAX have not fixed the issue.
Please let me know if there is a way to rewrite or remake my report or DAX to better accomplish the goal of displaying the correct rows. Let me know if I can clarify anything, and I'll try to answer as best I can (without showing the report).
Thank you!
Can you try this measure?
Measure =
COUNTX (
    NATURALINNERJOIN (
        DATESBETWEEN (
            dateTbl[DATE],
            CALCULATE ( MAX ( data[Begin Date] ) ),
            CALCULATE ( MAX ( data[End Date] ) )
        ),
        ALLSELECTED ( dateTbl[DATE] )
    ),
    [DATE]
)
COUNTX, NATURALINNERJOIN, DATESBETWEEN, CALCULATE, MAX, and ALLSELECTED are all supported in DirectQuery (DQ).
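The condition the question is after (at least one date between Begin Date and End Date falls inside the slicer range) is an interval-overlap test. Here is a minimal Python sketch of that logic, compared with the endpoint-only check from the question. The function names are illustrative assumptions, and this is not DAX.

```python
from datetime import date


def endpoint_in_range(begin: date, end: date, range_start: date, range_end: date) -> bool:
    """Check from the question: only the two endpoints are tested."""
    return range_start <= begin <= range_end or range_start <= end <= range_end


def overlaps(begin: date, end: date, range_start: date, range_end: date) -> bool:
    """True if [begin, end] shares at least one day with [range_start, range_end]."""
    return begin <= range_end and end >= range_start


# The row from the question that only shows up once the slicer is widened.
row = (date(2022, 3, 20), date(2022, 4, 3))
slicer = (date(2022, 4, 1), date(2022, 4, 2))

print(endpoint_in_range(*row, *slicer))  # False: neither endpoint is inside the slicer
print(overlaps(*row, *slicer))           # True: the row spans the whole slicer range
```

The overlap test needs only two comparisons, so it avoids enumerating every date between the begin and end of each row.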
Could you please help me apply different calculations to two rows in Power BI?
That is, to transform this table:
client_ids products purchased month
1 0 0 jan
2 1A 1 jan
2 1B 1 jan
3 0 0 jan
4 0 0 jan
5 0 0 feb
into this:
purchased jan feb
1 1
0 3 1
That is, to perform these calculations:
- for purchased = 0: count over month, client
- for purchased = 1: count distinct over month, client
Thank you.
I used this method:
- create a reference to the main query in the query editor
- drop the column with products
- drop duplicates
But this makes downloading the report slower.
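To make the requested calculation concrete, here is a small Python sketch of the two rules applied to the sample table: a plain row count for purchased = 0 and a distinct client count for purchased = 1, both grouped by month. It only illustrates the expected numbers and is not a Power BI solution. The function name is an assumption.

```python
from collections import defaultdict

# (client_id, product, purchased, month) rows from the sample table
rows = [
    (1, "0", 0, "jan"),
    (2, "1A", 1, "jan"),
    (2, "1B", 1, "jan"),
    (3, "0", 0, "jan"),
    (4, "0", 0, "jan"),
    (5, "0", 0, "feb"),
]


def summarize(rows):
    """Count rows for purchased = 0 and distinct clients for purchased = 1."""
    counts = defaultdict(int)
    clients = defaultdict(set)
    for client_id, _product, purchased, month in rows:
        key = (purchased, month)
        if purchased == 1:
            clients[key].add(client_id)  # distinct clients only
        else:
            counts[key] += 1             # every row counts
    for key, ids in clients.items():
        counts[key] = len(ids)
    return dict(counts)


result = summarize(rows)
print(result)
```

Client 2 bought two products in January but is counted once, which gives the 1 in the purchased = 1 row of the target table.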
To return the expected output, you can obtain the result from the data in four steps.
Assuming this is your table with dates:
First, calculate the month difference relative to today to find the most recent month (you can use another method depending on the nature of your data):
Mon Diff = (YEAR(NOW()) - YEAR(Sheet1[date])) * 12 + (MONTH(NOW()) - MONTH(Sheet1[date]))
Second, rank the recent month as current:
rank =
var ranking = RANKX(Sheet1,Sheet1[Mon Diff],,,Dense)
return
SWITCH(ranking,1,"prior",2,"current")
Third, generate the distinct values from the purchased column:
Table = DISTINCT(Sheet1[purchased])
Fourth, calculate the frequencies of 0 and 1 in the prior month (Jan), and do the same for the current month (Feb):
Jan = CALCULATE(COUNT(Sheet1[rank]),Sheet1[rank]="prior",
Sheet1[purchased]=EARLIER('Table'[purchased]))
Feb = CALCULATE(COUNT(Sheet1[rank]),Sheet1[rank]="current",
Sheet1[purchased]=EARLIER('Table'[purchased]))
The new table with the results (in Jan, purchased = 1 has 2 occurrences instead of 1, because client 2 appears twice):
My Data looks something like this:
ContractID Start Date End Date
1 01.01.2020 23.03.2020
2 15.02.2020 29.07.2020
3 06.06.2020 null
The last contract would still be active. I have a DateTable with the Start Date as the active relationship.
I need the end result to look like this:
Date Active Contracts
Jan 1
Feb 2
Mar 2
Apr 1
May 1
Jun 2
What should the measure look like?
Thanks in advance!
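Assuming you have a month number column in your date table:
Active Contracts =
VAR currentMonth = SELECTEDVALUE(DateTable[Month value]) -- needs to be a number 1 to 12
RETURN
    CALCULATE(
        COUNTROWS(MyDataTable),
        ALL(DateTable),
        -- started in or before the current month (month numbers only, so this assumes a single year)
        MONTH(MyDataTable[Start Date]) <= currentMonth,
        -- still open, or ended in or after the current month
        ISBLANK(MyDataTable[End Date]) || MONTH(MyDataTable[End Date]) >= currentMonth
    )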
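The counting rule can be checked outside Power BI. A contract is active in a month if it started on or before the last day of that month and either has no end date or ended on or after the first day of that month. Here is a minimal Python sketch using the three sample contracts. The function name is an assumption, and unlike a month-number comparison it uses full dates, so it also works across years.

```python
import calendar
from datetime import date

# (ContractID, Start Date, End Date); None means the contract is still open
contracts = [
    (1, date(2020, 1, 1), date(2020, 3, 23)),
    (2, date(2020, 2, 15), date(2020, 7, 29)),
    (3, date(2020, 6, 6), None),
]


def active_contracts(contracts, year: int, month: int) -> int:
    """Count contracts whose [start, end] interval overlaps the given month."""
    month_start = date(year, month, 1)
    month_end = date(year, month, calendar.monthrange(year, month)[1])
    return sum(
        1
        for _contract_id, start, end in contracts
        if start <= month_end and (end is None or end >= month_start)
    )


for m in range(1, 7):
    print(calendar.month_abbr[m], active_contracts(contracts, 2020, m))
```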
Hello, and many thanks in advance for your answers and your efforts to help newbie users in this forum.
I have a SAS table with the variables ID, Year, Month, and Creation date.
What I want is to keep only one row per ID for each month, year, and creation date.
My HAVE data is :
ID Year Month Date of creation
1 2019 1 a
1 2019 1 a
1 2019 1 b
1 2019 2 c
1 2019 3 d
1 2020 5 e
2 2019 1 a
2 2019 1 b
2 2019 3 c
3 2021 8 m
3 2021 9 k
My WANT data is
ID Year Month Date of creation
1 2019 1 a
1 2019 1 b
1 2019 2 c
1 2019 3 d
1 2020 5 e
2 2019 1 a
2 2019 1 b
2 2019 3 c
3 2021 8 m
3 2021 9 k
I tried NODUPKEY, but it removes IDs.
Your example seems to work fine with NODUPKEY option of PROC SORT. Perhaps you used the wrong BY variables?
data have;
input ID Year Month Creation $ ;
cards;
1 2019 1 a
1 2019 1 a
1 2019 1 b
1 2019 2 c
1 2019 3 d
1 2020 5 e
2 2019 1 a
2 2019 1 b
2 2019 3 c
3 2021 8 m
3 2021 9 k
;
proc sort data=have out=want nodupkey;
by id year month creation ;
run;
You can also use the DISTINCT clause in PROC SQL; it removes duplicates based on all columns:
proc sql;
create table want
as
select distinct * from have;
quit;
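To see why the choice of BY variables matters, here is a small Python sketch of key-based de-duplication on the sample data. It keeps the first row for each key, which mirrors the idea behind NODUPKEY but does not sort, so it is only an illustration under that assumption. The function name is hypothetical.

```python
# (ID, Year, Month, Creation) rows from the HAVE data
have = [
    (1, 2019, 1, "a"), (1, 2019, 1, "a"), (1, 2019, 1, "b"),
    (1, 2019, 2, "c"), (1, 2019, 3, "d"), (1, 2020, 5, "e"),
    (2, 2019, 1, "a"), (2, 2019, 1, "b"), (2, 2019, 3, "c"),
    (3, 2021, 8, "m"), (3, 2021, 9, "k"),
]


def drop_duplicate_keys(rows, key=lambda row: row):
    """Keep the first row seen for each key, preserving input order."""
    seen = set()
    kept = []
    for row in rows:
        k = key(row)
        if k not in seen:
            seen.add(k)
            kept.append(row)
    return kept


# Key on all four variables: only the exact duplicate row is dropped.
want = drop_duplicate_keys(have)
print(len(want))

# Key on ID alone: most rows disappear, one per ID remains.
by_id_only = drop_duplicate_keys(have, key=lambda row: row[0])
print(len(by_id_only))
```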
I'm trying to improve the processing time of an existing for-loop in a *.jsl file that my classmates and I use in our SAS programming course. My question: does SAS offer a PROC or a sequence of statements that can replicate a search-and-match condition? Or is there a way to go through unsorted files without checking line by line for the matching condition(s)?
Our current script file is below:
if( roadNumber_Fuel[n]==roadNumber_TO[m] &
fuelDate[n]>=tripStart[m] & fuelDate[n]<=TripEnd[m],
newtripID[n] = tripID[m];
);
I have 2 sets of data simplified below.
DATA1:
ID1 Date1
1 May 1, 2012
2 Jun 4, 2013
3 Aug 5, 2013
..
.
&
DATA2:
ID2 Date2 Date3 TRIP_ID
1 Jan 1 2012 Feb 1 2012 9876
2 Sep 5 2013 Nov 3 2013 931
1 Dec 1 2012 Dec 3 2012 236
3 Mar 9 2013 May 3 2013 390
2 Jun 1 2013 Jun 9 2013 811
1 Apr 1 2012 May 5 2012 76
...
..
.
I need to check a lot of iterations, but my goal is to have the code check:
Data1.ID1 = Data2.ID2 AND (Date1 > Date2 AND Date1 < Date3)
My desired output dataset would be
ID1 Date1 TRIP_ID
1 May 1, 2012 76
2 Jun 4, 2013 811
Thanks for any insight!
You can do range matches in two ways. First off, you can match using PROC SQL if you're familiar with SQL:
proc sql;
    create table C as
    select *
    from A
    left join B
        on A.id = B.id and A.date > B.date1 and A.date < B.date2
    ;
quit;
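The join condition can be illustrated on the sample data from the question. Here is a minimal Python sketch that groups the range rows by ID first, so each lookup only scans ranges with a matching ID. The function name is an assumption, and it returns matched rows only (the behaviour of an inner join), which is what the desired output shows.

```python
from datetime import date

# DATA1: (ID1, Date1)
data1 = [
    (1, date(2012, 5, 1)),
    (2, date(2013, 6, 4)),
    (3, date(2013, 8, 5)),
]

# DATA2: (ID2, Date2, Date3, TRIP_ID)
data2 = [
    (1, date(2012, 1, 1), date(2012, 2, 1), 9876),
    (2, date(2013, 9, 5), date(2013, 11, 3), 931),
    (1, date(2012, 12, 1), date(2012, 12, 3), 236),
    (3, date(2013, 3, 9), date(2013, 5, 3), 390),
    (2, date(2013, 6, 1), date(2013, 6, 9), 811),
    (1, date(2012, 4, 1), date(2012, 5, 5), 76),
]


def range_join(data1, data2):
    """Match each DATA1 row to DATA2 rows with the same ID and Date2 < Date1 < Date3."""
    ranges_by_id = {}
    for id2, start, end, trip_id in data2:
        ranges_by_id.setdefault(id2, []).append((start, end, trip_id))

    matches = []
    for id1, d in data1:
        for start, end, trip_id in ranges_by_id.get(id1, []):
            if start < d < end:
                matches.append((id1, d, trip_id))
    return matches


for match in range_join(data1, data2):
    print(match)
```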
Second, you can create a format. This is usually the faster option if it's possible to do this. This is tricky when you have IDs, but you can do it.
First, create a new variable, ID+date. Dates are numbers around 18,000-20,000, so multiply your ID by 100,000 and you're safe.
Second, create a dataset from the range dataset where START=lower date plus id*100,000, END=higher date + id*100,000, FMTNAME=some string that will become the format name (must start with A-Z or _ and have A-Z, _, digits only). LABEL is the value you want to retrieve (Trip_ID in the above example).
data b_fmts;
set b;
start=id*100000+date1;
end =id*100000+date2;
label=value_you_want_out;
fmtname='MYDATEF';
run;
Then use PROC FORMAT with the CNTLIN= option to import the formats.
proc format cntlin=b_fmts;
run;
Make sure your date ranges don't overlap; if they do, this will fail.
Then you can use it easily:
data a_match;
set a;
trip_id=put(id*100000+date,MYDATEF.);
run;
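The composite-key idea behind the format can be sketched in Python: fold the ID and the date into one number (ID*100,000 plus the date as days since 1 January 1960, which is how SAS stores dates), sort the ranges, and find the enclosing range with a binary search. The helper names are assumptions. The lookup treats range endpoints as inclusive and relies on the ranges not overlapping, as noted above.

```python
from bisect import bisect_right
from datetime import date

SAS_EPOCH = date(1960, 1, 1)  # SAS date values count days from this day

# (ID2, Date2, Date3, TRIP_ID) rows from the question
data2 = [
    (1, date(2012, 1, 1), date(2012, 2, 1), 9876),
    (2, date(2013, 9, 5), date(2013, 11, 3), 931),
    (1, date(2012, 12, 1), date(2012, 12, 3), 236),
    (3, date(2013, 3, 9), date(2013, 5, 3), 390),
    (2, date(2013, 6, 1), date(2013, 6, 9), 811),
    (1, date(2012, 4, 1), date(2012, 5, 5), 76),
]


def composite(id_: int, d: date) -> int:
    """Combine ID and date into one number, as in start=id*100000+date1."""
    return id_ * 100_000 + (d - SAS_EPOCH).days


def build_format(rows):
    """Return sorted range starts plus the (start, end, label) ranges themselves."""
    ranges = sorted((composite(i, s), composite(i, e), label) for i, s, e, label in rows)
    starts = [r[0] for r in ranges]
    return starts, ranges


def lookup(fmt, id_: int, d: date):
    """Return the label of the range containing (id_, d), or None if there is none."""
    starts, ranges = fmt
    key = composite(id_, d)
    pos = bisect_right(starts, key) - 1  # last range starting at or before key
    if pos >= 0 and key <= ranges[pos][1]:
        return ranges[pos][2]
    return None


fmt = build_format(data2)
print(lookup(fmt, 1, date(2012, 5, 1)))
print(lookup(fmt, 2, date(2013, 6, 4)))
print(lookup(fmt, 3, date(2013, 8, 5)))
```

Each lookup is a binary search over the sorted ranges rather than a scan of every row.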