In Presto, how to query a column which has no value - amazon-athena

I have an integer column which is empty. How do I query for that? I tried
select * from table where target_status_code = null
but it did not return anything.

You should use is null to check whether the integer column is empty:
select * from table where target_status_code is null
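To illustrate why = null matches nothing, here is a minimal sketch (my_table is a hypothetical name standing in for your table):
-- Any comparison with NULL evaluates to NULL (unknown), and WHERE treats
-- unknown as false, so "= null" never matches a row:
SELECT 200 = null;   -- returns NULL, not true and not false
-- is null is the dedicated predicate for missing values:
SELECT * FROM my_table WHERE target_status_code IS NULL;
-- and its complement:
SELECT * FROM my_table WHERE target_status_code IS NOT NULL;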

Related

How to exclude null values using REGEXP_SUBSTR

The following statement retrieves the value of the sub-tag msg_id from the MISC column if the sub-tag contains a value like %PACS%.
SELECT REGEXP_SUBSTR(MISC, '(^|\s|;)msg_id = (.*?)\s*(;|$)',1,1,NULL,2) AS TRANS_REF FROM MISC_HEADER
WHERE MISC LIKE '%PACS%';
I notice the query returns records with a null value (without msg_id) as well. Is there any way to exclude those null records within the REGEXP_SUBSTR syntax itself, without adding any where clause?
Sample data of MISC:
channel=atm ; phone=0123 ; msg_id=PACS00812 ; ustrd=U123
channel=pos; phone=9922; ustrd=U156
The second record has no msg_id, so it needs to be excluded.
This method does not use REGEXP so may not be suitable for you.
However, it does satisfy your requirement.
This takes your embedded list, breaks it out into a row per component for each ID (I've assumed you do have something that uniquely identifies each record).
It then only returns the original row where one of the rows for the ID has 'PACS' in it.
WITH thedata AS
  (SELECT 1 AS theid
        , 'channel=atm ; phone=0123 ; msg_id=PACS00812 ; ustrd=U123' AS msg_id
   FROM DUAL
   UNION ALL
   SELECT 2, 'channel=pos; phone=9922; ustrd=U156' FROM DUAL)
, mylist AS
  (SELECT theid, COLUMN_VALUE AS msg_component
   FROM thedata
      , XMLTABLE(('"' || REPLACE(msg_id, ';', '","') || '"')))
SELECT *
FROM thedata td
WHERE EXISTS
  (SELECT 1
   FROM mylist m
   WHERE m.theid = td.theid
   AND m.msg_component LIKE '%PACS%')
The thedata sub-query is simply there to generate a couple of records and pretend to be your table. You could remove it and substitute your actual table name.
There are other ways to break up an embedded list including ones that use REGEXP, I just find the XMLTABLE method 'cleaner'.
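For completeness, a minimal sketch of the REGEXP route mentioned above. It does use a where clause, which the question hoped to avoid, but it is the usual one-liner; the \s*=\s* in both patterns is an assumption to tolerate optional spaces around the equals sign:
SELECT REGEXP_SUBSTR(MISC, '(^|\s|;)msg_id\s*=\s*(.*?)\s*(;|$)', 1, 1, NULL, 2) AS trans_ref
FROM MISC_HEADER
-- REGEXP_LIKE keeps only rows that actually contain a msg_id sub-tag,
-- so rows like the second sample record are filtered out before extraction:
WHERE REGEXP_LIKE(MISC, 'msg_id\s*=');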

How to fix SQL Error [306] [S0002]: The text, ntext, and image data types cannot be compared or sorted, except when using IS NULL or LIKE operator?

I have a problem when using "=" (the equal-to operator) to compare the SQL Server text data type.
This is what my query looks like:
SELECT * FROM dbeplanningv3.dbo.usulan_dpr
WHERE CONVERT(VARCHAR, evaluasi) is null
or
trim(CONVERT(VARCHAR, evaluasi)) = '' ORDER BY [detail] ASC OFFSET 0 ROWS
FETCH NEXT 10 ROWS ONLY
as you can see in my script above, I am casting the data type with CONVERT(VARCHAR, evaluasi),
but it still does not work and I get the error SQL Error [306] [S0002]: The text, ntext, and image data types cannot be compared or sorted.
This is a part of my table structure:
[screenshot]
Please help.
In your query:
SELECT * FROM dbeplanningv3.dbo.usulan_dpr
WHERE CONVERT(VARCHAR, evaluasi) is null
or
trim(CONVERT(VARCHAR, evaluasi)) = '' ORDER BY [detail] ASC OFFSET 0 ROWS
FETCH NEXT 10 ROWS ONLY
only ORDER BY [detail] could cause this error, so I will assume [detail] is of type text (this column isn't visible in your screenshot). To avoid the error, you should convert it to varchar(max):
SELECT * FROM dbeplanningv3.dbo.usulan_dpr
WHERE CONVERT(VARCHAR, evaluasi) is null
or
trim(CONVERT(VARCHAR, evaluasi)) = '' ORDER BY convert(varchar(max), [detail]) ASC OFFSET 0 ROWS
FETCH NEXT 10 ROWS ONLY
But the important question is: why, on SQL Server 2012, are you still using the text data type? You should convert these columns to varchar(max) and avoid casting them all the time.
Also, the cast in CONVERT(VARCHAR, evaluasi) is null is pointless. You can check evaluasi is null directly.
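A minimal migration sketch along those lines (assuming [detail] is indeed text, as guessed above; test on a copy first, and state NULL/NOT NULL explicitly if the columns have constraints):
-- Widen the legacy text columns in place; existing data is preserved:
ALTER TABLE dbeplanningv3.dbo.usulan_dpr ALTER COLUMN evaluasi varchar(max);
ALTER TABLE dbeplanningv3.dbo.usulan_dpr ALTER COLUMN [detail] varchar(max);
After that, the original query works without any CONVERT calls at all.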

CTAS vs INSERT/SELECT to empty columnar table on Azure SQL Data Warehouse

I am running a series of tests to understand the throughput per DWU. I have eight (8) scenarios varying the ETL approach (CTAS vs INSERT/SELECT), varying the input table type (heap vs columnar), and varying the output table type (heap vs columnar).
Unexpectedly, using a columnar input table, writing to a columnar output table, using either INSERT/SELECT or CTAS yielded the same throughput (8,100 rows per second per DWU).
Why would there not be some penalty associated with "full logging" of the INSERT/SELECT construct?
Givens:
DWU = 600
source table: 17 columns, 1.33B rows
target table empty beforehand
INSERT/SELECT Script:
CREATE TABLE
etl_schema_name.fact_table_benchmark_testing
(
column_1 INTEGER NOT NULL
,column_2 INTEGER NOT NULL
,column_3 SMALLINT NOT NULL
,column_4 SMALLINT NOT NULL
,column_5 INTEGER NOT NULL
,column_6 DECIMAL(9,4) NOT NULL
,column_7 DECIMAL(9,2) NOT NULL
,column_8 SMALLINT NOT NULL
,column_9 CHAR(1) NOT NULL
,column_10 SMALLINT NOT NULL
,column_11 DECIMAL(9,2) NOT NULL
,column_12 DECIMAL(9,2) NOT NULL
,column_13 DECIMAL(9,2) NOT NULL
,column_14 DECIMAL(9,2) NOT NULL
,column_15 DECIMAL(9,2) NOT NULL
,column_16 DECIMAL(9,2) NOT NULL
,column_17 DECIMAL(9,2) NOT NULL
)
WITH
(
DISTRIBUTION = HASH ( column_2 )
)
;
GO
insert into
etl_schema_name.fact_table_benchmark_testing
(
column_1
,column_2
,column_3
,column_4
,column_5
,column_6
,column_7
,column_8
,column_9
,column_10
,column_11
,column_12
,column_13
,column_14
,column_15
,column_16
,column_17
)
select
column_1
,column_2
,column_3
,column_4
,column_5
,column_6
,column_7
,column_8
,column_9
,column_10
,column_11
,column_12
,column_13
,column_14
,column_15
,column_16
,column_17
FROM
production_schema_name.fact_table
;
GO
CTAS Script:
CREATE TABLE
etl_schema_name.fact_table_benchmark_testing_2
WITH
(
DISTRIBUTION = HASH ( column_2 )
)
as
select
column_1
,column_2
,column_3
,column_4
,column_5
,column_6
,column_7
,column_8
,column_9
,column_10
,column_11
,column_12
,column_13
,column_14
,column_15
,column_16
,column_17
FROM
production_schema_name.fact_table
;
GO
INSERT...SELECT is not necessarily fully logged in SQL DW. Have you had a chance to review the following article?
https://learn.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-develop-best-practices-transactions
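For context: the article describes CTAS and INSERT...SELECT as bulk-load operations, and a bulk load into a clustered columnstore target is minimally logged as long as each distribution receives a large enough batch (the article cites 102,400 rows per partition-aligned distribution), which would account for the identical throughput into an empty target. The full-logging penalty shows up with small batches; a hedged sketch reusing the tables above:
-- Trickle insert: well under the batch-size threshold, the rows land in the
-- delta store and are fully logged, unlike the bulk loads in the scripts
-- above, so comparing this run's throughput isolates the logging cost:
INSERT INTO etl_schema_name.fact_table_benchmark_testing
SELECT TOP 1000 * FROM production_schema_name.fact_table;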

SQLite starting rowid of 0

I'm trying to set up a SQLite table where the rowid starts from 0 instead of the default 1. The end goal is to be able to run the first INSERT statement and have it insert to rowid 0. Explicitly setting rowid to 0 for that first INSERT is not an option.
I've tried a few things related to AUTOINCREMENT but am not having any luck getting this to work cleanly. The only successful way I've found is to insert a row with rowid of -1 and then delete it later. This works but it's messy and I'd like to find a cleaner way of doing it. I am working in Python 2.7 with the built-in sqlite3 library.
The bottom-line question:
Is there a cleaner way to start rowid from 0 other than manually inserting a -1 value and then removing it later?
Some side information:
I found a similar question here and played with some AUTOINCREMENT settings: Set start value for AUTOINCREMENT in SQLite
The sqlite_sequence table doesn't seem to work with negative numbers. I used the following to test it:
import sqlite3
con = sqlite3.Connection('db.db')
cur = con.cursor()
cur.execute("CREATE TABLE test(id INTEGER PRIMARY KEY AUTOINCREMENT, val TEXT)")
cur.execute("INSERT INTO sqlite_sequence (name,seq) VALUES (?,?)", ('test',-1))
cur.execute("INSERT INTO test (val) VALUES (?)", ('testval',)) #becomes rowid 1
cur.execute("INSERT INTO test (val) VALUES (?)", ('testval',)) #becomes rowid 2
cur.execute("INSERT INTO test (val) VALUES (?)", ('testval',)) #becomes rowid 3
cur.execute("SELECT rowid, id, val FROM test")
print cur.fetchall()
With the -1 inserted into sqlite_sequence it should set the next rowid to 0, but it's using 1 instead. If sqlite_sequence is initialized to a positive number the rowids are as expected.
import sqlite3
con = sqlite3.Connection('db.db')
cur = con.cursor()
cur.execute("CREATE TABLE test(id INTEGER PRIMARY KEY AUTOINCREMENT, val TEXT)")
cur.execute("INSERT INTO sqlite_sequence (name,seq) VALUES (?,?)", ('test',10))
cur.execute("INSERT INTO test (val) VALUES (?)", ('testval',)) #becomes rowid 11
cur.execute("INSERT INTO test (val) VALUES (?)", ('testval',)) #becomes rowid 12
cur.execute("INSERT INTO test (val) VALUES (?)", ('testval',)) #becomes rowid 13
cur.execute("SELECT rowid, id, val FROM test")
print cur.fetchall()
Does auto-increment not support negative numbers like this? I couldn't find any mention of it in the SQLite documentation.
The documentation says that, with AUTOINCREMENT,
the ROWID chosen for the new row is at least one larger than the largest ROWID that has ever before existed in that same table.
So the algorithm looks not only at the value in the sqlite_sequence table, but also at the last row in the table, and uses the larger of these two values.
When the table is empty, the largest actual rowid is instead assumed to be zero. This is done so that the first inserted rowid becomes 1.
Therefore, the only way to generate a rowid less than one is to have another row already in the table.
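So the seed-row trick the question calls messy really is the workable route. A minimal sketch, using a plain INTEGER PRIMARY KEY without AUTOINCREMENT so that the next rowid is simply one more than the largest rowid currently in the table:
CREATE TABLE test(id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO test (id, val) VALUES (-1, 'seed');  -- largest rowid is now -1
INSERT INTO test (val) VALUES ('first');         -- gets rowid -1 + 1 = 0
DELETE FROM test WHERE id = -1;                  -- drop the seed afterwards
SELECT id, val FROM test;                        -- 0 | first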
I was using SQLite in Java and unintentionally got the first row to have ROWID=0.
Here is the code:
long dbId = getDbid();
ContentValues values = new ContentValues();
values.put(KEY_ROWID, dbId); // the line that caused ROWID == 0 when dbId == 0
values.put(KEY_KEY, key);
values.put(KEY_VALUE, value);
if (dbId == 0) {
    dbId = dbase.insert(PREFERENCES_TABLE, null, values);
    map_.put(key, new Pair<>(dbId, value));
} else {
    dbase.update(PREFERENCES_TABLE, values, KEY_ROWID + "=" + dbId, null);
}
The intent was for zero to mean "uninitialized", but insert() was returning zero. I didn't want the first ROWID to equal zero, so I removed that line, and insert() then returned one.

Remove rows from SQL DB that appear in an array

I develop with MFC Visual C++ and Microsoft SQL Server.
I have an SQL table with IDs, values and times; when the application inserts a new row, an ID, a value and a timestamp are inserted.
My goal is to delete rows whose values changed within a certain time window, since the data inserted during that time has incorrect values.
Where is the catch? I don't need to delete all the rows that were updated in that time period, only the rows whose IDs appear in a certain CArray.
I could go through each ID in the CArray and execute a delete query for that ID in that time period (whether there is an entry or not), but that is a problem, since I can have 150K IDs to iterate over.
Thanks
DELETE FROM table-name WHERE id in (...)
Transform your array into a temp table with one column and then delete from your destination table where ID in (select Id from tempTable).
Here is an example:
declare @RegionID varchar(50)
SET @RegionID = '853,834,16,467,841'
declare @S varchar(20)
if LEN(@RegionID) > 0 SET @RegionID = @RegionID + ','
CREATE TABLE #ARRAY(region_ID VARCHAR(20))
WHILE LEN(@RegionID) > 0 BEGIN
    -- peel off the value before the next comma and store it in the temp table
    SELECT @S = LTRIM(SUBSTRING(@RegionID, 1, CHARINDEX(',', @RegionID) - 1))
    INSERT INTO #ARRAY (region_ID) VALUES (@S)
    -- drop the consumed value (and its comma) from the front of the list
    SELECT @RegionID = SUBSTRING(@RegionID, CHARINDEX(',', @RegionID) + 1, LEN(@RegionID))
END
delete from your_table
where regionID IN (select region_ID from #ARRAY)
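One more detail: the question constrains the delete to a time window as well as the ID list, so the final statement would carry both predicates. A sketch with hypothetical column and variable names (insert_time, @start_time, @end_time):
declare @start_time datetime = '2020-01-01 00:00:00'  -- hypothetical bad-data window
declare @end_time   datetime = '2020-01-02 00:00:00'
-- only rows whose ID is in the array AND whose timestamp falls in the window:
delete from your_table
where regionID IN (select region_ID from #ARRAY)
  and insert_time between @start_time and @end_time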