There is a large table containing among other fields the following:
ID, effective_date, Expiration_date.
expiration_date is in datetime20. format and can be NULL.
I'm trying to extract rows that expire after Dec 31, 2014 or do not expire (NULL).
Adding the following where statement to the proc sql query gives me no results
where coalesce(datepart(expiration_date),input('31/Dec/2020',date11.))
> input('31/Dec/2014',date11.);
However, when I only select NULL expiration dates and add the following fields:
put(coalesce(datepart(expiration_date),input('31/Dec/2020',date11.)),date11.) as value,
put(input('31/Dec/2014',date11.),date11.) as threshold,
case when coalesce(datepart(expiration_date),input('31/Dec/2020',date11.)) > input('31/Dec/2014',date11.)
then 'pass' else 'fail' end as tag
It shows 'pass' under TAG and all the other fields are correct.
This is an effort to duplicate what I used in SQL Server
where isnull(expiration_date,'9999-12-31') > '2014-12-31'
Using SAS Enterprise Guide 7.1, and while trying to figure it out I've been using
proc sql inobs=100;
What am I doing wrong? Thank you.
Some Expiration Dates:
30OCT2015:00:00:00
30OCT2015:00:00:00
29OCT2015:00:00:00
30OCT2015:00:00:00
I would recommend using a date constant ("31DEC2014"d) rather than date functions, or else either use explicit passthrough or disable implicit passthrough. Date functions are challenging when going between databases and so avoiding them when possible is best.
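For example, here is a minimal sketch of the whole filter using date constants, with big_table standing in for the real table name (which isn't given) and a far-future fallback mirroring the '9999-12-31' from the SQL Server version:
proc sql inobs=100;
  /* big_table is a stand-in name; NULL expiration dates fall back to a
     far-future date constant, so they pass the filter */
  select ID, effective_date, expiration_date
  from big_table
  where coalesce(datepart(expiration_date), '31DEC9999'd) > '31DEC2014'd;
quit;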
Related
I have a GCP-based environment. I use standard SQL scripting in GCP BigQuery and federated queries to Cloud SQL MySQL. The federated query selects data from the Cloud SQL MySQL database, and I need to select that data based on a condition that depends on data in BigQuery. I use variables in BigQuery's standard SQL scripting to store the value that I select from BigQuery, and I want to use the value of this variable in the where clause of the MySQL query. See the following example, where I select a date from BigQuery and store it in a variable "BQ_LAST_DATETIME".
DECLARE BQ_LAST_DATETIME DATETIME;
SET BQ_LAST_DATETIME = (select max(date_created) from bq_my_dataset.bq_my_table);
I am using a BigQuery federated query to read data out of the Cloud SQL database (https://cloud.google.com/bigquery/docs/cloud-sql-federated-queries) as shown below, and I want to use the value stored in the variable "BQ_LAST_DATETIME" in the MySQL query's where clause:
SELECT * FROM EXTERNAL_QUERY("my-gcp-project.my-region.my-connection2-cloudsql", "select * from mysqlschema.mysql_table where date_created = #BQ_LAST_DATETIME;");
Please note that in the above query I have used "#BQ_LAST_DATETIME" as a placeholder to show what I want to achieve. I am not sure whether I can directly use a BigQuery scripting variable as a query parameter in the "external" query part of the federated query.
Any suggestions on how to parametrize external queries in a federated query, or another way to achieve the same effect?
I actually tried the following. I used a BigQuery scripting variable as a query parameter in the "external" query part of the federated query. The only nuance is that, since I was dealing with dates, I performed a cast, and because the date variable is actually treated as a string, I formatted it back to a date using MySQL's STR_TO_DATE, as follows:
DECLARE BQ_LAST_DATETIME DATETIME;
DECLARE BQ_LAST_DATE DATE;
SET BQ_LAST_DATETIME = (select max(date_created) from bq_my_dataset.bq_my_table);
SET BQ_LAST_DATE = CAST(BQ_LAST_DATETIME AS DATE);
SELECT * FROM EXTERNAL_QUERY("my-gcp-project.my-region.my-connection2-cloudsql", "select * from mysqlschema.mysql_table where date_created = STR_TO_DATE(#BQ_LAST_DATE,'%Y-%m-%d');");
While this query is accepted by the parser, it is NOT giving the expected result.
Basically, the value of the variable #BQ_LAST_DATE does not seem to reach the MySQL query as expected.
Does anyone know what I am missing?
Thanks a lot for your help.
You can try EXECUTE IMMEDIATE:
DECLARE BQ_LAST_DATETIME STRING;
DECLARE DSQL STRING;
SET BQ_LAST_DATETIME = 'SELECT max(date_created) from bq_my_dataset.bq_my_table';
SET DSQL = '"select * from mysqlschema.mysql_table where date_created = (' || BQ_LAST_DATETIME || ')"';
EXECUTE IMMEDIATE 'SELECT * FROM EXTERNAL_QUERY("my-gcp-project.my-region.my-connection2-cloudsql",' || DSQL || ');'
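If the goal is specifically to push the BigQuery value into the MySQL side, a variant of the same idea is to resolve the date first and splice the literal into the external query string. This is only a sketch: it assumes date_created is a DATETIME and that its default string form is acceptable to MySQL.
DECLARE BQ_LAST_DATE STRING;
-- pull the max date out of BigQuery first and keep it as text
SET BQ_LAST_DATE = (SELECT CAST(MAX(date_created) AS STRING) FROM bq_my_dataset.bq_my_table);
-- splice the literal into the MySQL query before sending it to Cloud SQL
EXECUTE IMMEDIATE FORMAT("""
SELECT * FROM EXTERNAL_QUERY(
  "my-gcp-project.my-region.my-connection2-cloudsql",
  "select * from mysqlschema.mysql_table where date_created = '%s'")
""", BQ_LAST_DATE);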
I think I have read just about every post on this topic and none of the proposed solutions works in my case, so here goes.
I am using CF9 (upgrading is not an option) for this project. I query a date field from an MSSQL database and use spreadsheetAddRows() to put the results into a spreadsheet (xls or xlsx, same result either way).
The date shows in Excel as 2020-05-11 00:00:00.0 and isn't recognised as a date, so the date formatting doesn't work.
I have tried using SpreadsheetFormatColumn(s, { dataformat="d-mmm-yy" }, 2); but this doesn't format the date either and has the exact same result in Excel.
I have tried many variations of selecting convert(varchar, datecolumn, 101) from the database, but these always just end up as text fields in Excel as well, so again no date formatting, and they sort in the wrong order.
Can anyone tell me what the correct format for a date is for CFSpreadsheet so that Excel actually recognises it as a date?
I have a dataset in Power BI as below.
In Power BI Desktop I want to filter records of the table below based on the condition described below.
ProjectName  ReleaseDate  UserReleaseDate
PROJ-1       12/09/2019   null
PROJ-2       null         02/02/2019
PROJ-3       07/07/2018   null
Dates are in DD/MM/YYYY format.
I want to filter those records where
(ReleaseDate OR UserReleaseDate is IsInNextNYears(1))
You pretty much just do exactly what you described.
Table.SelectRows(YourTable, each (Date.IsInNextNYears([ReleaseDate], 1) or Date.IsInNextNYears([UserReleaseDate], 1)))
If you use any filter operation on a column in Power Query it will automatically create a Table.SelectRows step that you can edit to do what you want instead.
I have a standard listings table on Redshift with all varchar columns (due to how the data was loaded into the database).
This query (simplified) gives me error:
with AL as (
  select
    L.price::int as price
  from listings L
  where L.price <> 'NULL'
    and L.listing_type <> 'NULL'
)
select price from AL
where price < 800
and the error:
-----------------------------------------------
error: Invalid digit, Value 'N', Pos 0, Type: Integer
code: 1207
context: NULL
query: 2422868
location: :0
process: query0_24 [pid=0]
-----------------------------------------------
If I remove the where price < 800 condition, the query returns just fine... but I need the where condition to be there.
I've also checked the number validity of the price field and all look good.
After playing around, this actually makes it work, and I can't quite explain why.
with AL as (
  select
    L.price::int as price
  from listings L
  where L.price <> 'NULL'
    and L.listing_type <> 'NULL'
  limit 10000000000
)
select price from AL
where price < 800
Note that the table has far fewer records than the number stated in the limit.
Can anyone (possibly from the Redshift engineering team) explain why this is the way it is? Possibly something to do with how the query plan is being executed and parallelized?
I had a query that could be expressed simply as:
SELECT TOP 10 field1, field2
FROM table1
INNER JOIN table2
ON table1.field3::int = table2.field3
ORDER BY table1.field1 DESC
Removing the explicit cast to ::int solved a similar error for me.
Meanwhile, PostgreSQL locally requires the "::int" in order to work.
For what it's worth, my local postgresql version is
PostgreSQL 9.6.4 on x86_64-apple-darwin16.7.0, compiled by Apple LLVM version 8.1.0 (clang-802.0.42), 64-bit
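If the cast cannot simply be dropped (the comparison really needs to be numeric), another option is to make the cast itself safe, so it cannot fail regardless of when Redshift evaluates the outer filter. A rough sketch against the listings table from the question, assuming rows with a non-numeric price can just be excluded:
with AL as (
  select
    -- cast only when the value is purely digits; anything else becomes NULL
    case when L.price ~ '^[0-9]+$' then L.price::int end as price
  from listings L
  where L.listing_type <> 'NULL'
)
select price
from AL
where price < 800;  -- NULL prices never satisfy the comparison, so they drop out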
Loading CSV data with NaN into AWS Redshift
I found this post while searching Google, but the above link had what I needed. I was importing a numeric column containing the value NaN, which is unsupported by Redshift's numeric type.
We have a staging table that's used to load raw data from our suppliers.
One column is used to capture a timestamp, but its data type is varchar(265). The data is dirty: about 40% of the time it is garbage, otherwise it is timestamp data like this:
2011/11/15 20:58:48.041
I have to create a report that filters some dates/timestamps out of that column, but when I try to cast it, I get an error:
db2 => select cast(loadedon as timestamp) from automation
1
--------------------------
SQL0180N The syntax of the string representation of a datetime value is incorrect. SQLSTATE=22007
What do I need to do in order to parse/cast the timestamp string?
The string format for a DB2 timestamp is either:
'2002-10-20-12.00.00.000000'
or
'2002-10-20 12:00:00'
You have to get your date string in either of these formats.
Also, DB2 runs on a 24-hour clock even though the output sometimes uses a 12-hour clock (AM/PM).
So '2002-10-20 14:49:50' for 2:49:50 PM
Or '2002-10-20 00:00:00' for midnight. The output would be 12:00:00 AM.
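As a rough illustration of that second format, one way to reshape a value like '2011/11/15 20:58:48.041' is to swap the slashes for dashes and drop the fractional seconds before casting. This is only a sketch, and it assumes the row really does hold a value in that shape; the garbage rows still need to be filtered out first.
-- keep the first 19 characters ('2011/11/15 20:58:48') and replace '/' with '-',
-- which yields the 'yyyy-mm-dd hh:mm:ss' form that DB2 accepts
SELECT CAST(REPLACE(SUBSTR(loadedon, 1, 19), '/', '-') AS TIMESTAMP)
FROM automation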
It seems you have a lot of garbage data, so first of all you should check whether the data is a valid timestamp in the format you expect ('2011/11/15 20:58:48.041'). We could use a simple solution: just replace all digits with '0' and check the resulting format:
TRANSLATE(timestamp_column,'0','0123456789','0') = '0000/00/00 00:00:00.000'
If the format is the expected one, you can convert to a DB2 timestamp. In DB2 for iSeries there is a built-in function since V6R1, TIMESTAMP_FORMAT. In your case it will look like this:
TIMESTAMP_FORMAT('2011/11/15 20:58:48.041','YYYY/MM/DD HH24:MI:SS.NNNNNN')
So the combined solution query should look something like this:
SELECT
CASE
WHEN TRANSLATE(timestamp_column,'0','0123456789','0') = '0000/00/00 00:00:00.000'
THEN TIMESTAMP_FORMAT(timestamp_column,'YYYY/MM/DD HH24:MI:SS.NNNNNN')
ELSE NULL
END
FROM
your_table_with_bad_data
EDIT
I just saw your comment that the provider agreed to clean the data. You could use the solution provided to speed up the process and clean the data yourself:
ALTER TABLE your_table_with_bad_data ADD COLUMN clean_timestamp TIMESTAMP DEFAULT NULL;
UPDATE your_table_with_bad_data
SET clean_timestamp =
CASE
WHEN TRANSLATE(timestamp_column,'0','0123456789','0') = '0000/00/00 00:00:00.000'
THEN TIMESTAMP_FORMAT(timestamp_column,'YYYY/MM/DD HH24:MI:SS.NNNNNN')
ELSE NULL
END;