Unable to join two tables from two different databases in Athena - amazon-web-services

I have 2 databases in Athena each with it's own table. I'm not sure how to join two tables.Contractinfo_2019 is a database and so is enrollmentinfo_2019 another database. I keep getting error :
"SYNTAX_ERROR: line 11:10: Table awsdatacatalog.enrollmentinfo_2019.contractinfo2019 does not exist
This query ran against the "enrollmentinfo_2019" database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: 1bbc3941-4fa1-40a0-87c1-eb093784c990."
SELECT a.*,
b.*
FROM
(SELECT contract_id,
plan_id,
organization_type,
plan_type,
organization_name,
plan_name,
parent_organization
FROM contractinfo2019) AS a
LEFT JOIN
(SELECT contract_number,
plan_id,
state,
county,
enrollment
FROM enrollmentinfo2019) AS b
ON a.contract_id=b.contract_number
AND a.plan_id=b.plan_id
Can someone please guide me how to join table's in Athena. I'm not sure what am i doing wrong here?

I would recommend re-writing the query using WITH
for example:
WITH a AS
(SELECT contract_id,
plan_id,
organization_type,
plan_type,
organization_name,
plan_name,
parent_organization
FROM Contractinfo_2019.contractinfo2019),
b as
(SELECT contract_number,
plan_id,
state,
county,
enrollment
FROM enrollmentinfo_2019.enrollmentinfo2019)
SELECT * FROM a
LEFT JOIN b ON a.contract_id=b.contract_number
AND a.plan_id=b.plan_id

You just need qualified table names.
Instead of:
FROM contractinfo2019
use this (assuming I got your database and table name right):
FROM contractinfo_2019.contractinfo2019

Related

Query for listing Datasets and Number of tables in Bigquery

So I'd like make a query that shows all the datasets from a project, and the number of tables in each one. My problem is with the number of tables.
Here is what I'm stuck with :
SELECT
smt.catalog_name as `Project`,
smt.schema_name as `DataSet`,
( SELECT
COUNT(*)
FROM ***DataSet***.INFORMATION_SCHEMA.TABLES
) as `nbTable`,
smt.creation_time,
smt.location
FROM
INFORMATION_SCHEMA.SCHEMATA smt
ORDER BY DataSet
The view INFORMATION_SCHEMA.SCHEMATA lists all the datasets from the project the query is executed, and the view INFORMATION_SCHEMA.TABLES lists all the tables from a given dataset.
The thing is that the view INFORMATION_SCHEMA.TABLES needs to have the dataset specified like this give the tables informations : dataset.INFORMATION_SCHEMA.TABLES
So what I need is to replace the *** DataSet*** by the one I got from the query itself (smt.schema_name).
I am not sure if I can do it with a sub query, but I don't really know how to manage to do it.
I hope I'm clear enough, thanks in advance if you can help.
You can do this using some procedural language as follows:
CREATE TEMP TABLE table_counts (dataset_id STRING, table_count INT64);
FOR record IN
(
SELECT
catalog_name as project_id,
schema_name as dataset_id
FROM `elzagales.INFORMATION_SCHEMA.SCHEMATA`
)
DO
EXECUTE IMMEDIATE
CONCAT("INSERT table_counts (dataset_id, table_count) SELECT table_schema as dataset_id, count(table_name) from ", record.dataset_id,".INFORMATION_SCHEMA.TABLES GROUP BY dataset_id");
END FOR;
SELECT * FROM table_counts;
This will return something like:

Unable to connect snowflake query to power bi - Syntax

I have this query in snowflake. The query works fine in snowflake, but when i am trying to connect it to Power Bi, I get the Native error query. The error usually pops up when there's a syntax error. I can't find any syntax error here.
Any help would be appreciated as why there's an error.
Error: Native Queries aren't supported by this value.
WITH POLICIES AS(
SELECT DISTINCT a.POLICY_NUMBER
,c.DST
,d.DOB
,b.ENROLLED_RPM
,b.RATED_STATE
,a.EVENT_TIMESTAMP
FROM PD_PRESENTATION.CUSTOMER.REQUEST_FLOW_EDGE_MOBILE_TIER as a
LEFT JOIN PD_ANALYTICS.SVOC.POLICY as b
ON a.POLICY_NUMBER = b.POLICY_NUMBER
LEFT JOIN PD_ANALYTICS.SVOC.POLICY_HAS_POLICYHOLDER_PERSON as c
ON b.ID = c.SRC
LEFT JOIN PD_ANALYTICS.SVOC.PERSON as d
ON d.ID = c.DST
WHERE a.USER_GROUP = 'Customer'
AND b.STATUS = 'InForce'
),
MaximumTime AS(
SELECT a.POLICY_NUMBER
,MAX(a.EVENT_TIMESTAMP) as MAXDATED
FROM POLICIES as a
GROUP BY a.POLICY_NUMBER
)
SELECT DISTINCT a.*
,b.DOB
,b.ENROLLED_RPM
,b.RATED_STATE
,c.PAPERLESSPOLICYSTATUS
,c.PARTIALPAPERLESSSTATUS
,c.PAYPLAN
,MAX(c.TENUREPOLICYYEARS) as TENURE
FROM MaximumTime as a
LEFT JOIN POLICIES as b
ON a.POLICY_NUMBER = b.POLICY_NUMBER
LEFT JOIN PD_POLICY_CONFORMED.PEAK.POLICY as c
ON a.POLICY_NUMBER = c.POLICY_NUMBER
GROUP BY a.POLICY_NUMBER
,a.MAXDATED
,b.DOB, b.ENROLLED_RPM
,b.RATED_STATE
,c.PAPERLESSPOLICYSTATUS
,c.PARTIALPAPERLESSSTATUS
,c.PAYPLAN
Based on googling I suspect that this is caused by the driver you are using (odbc).
If SQL is running fine in snowflake it means it's syntax is correct and there must be an error somewhere between powerbi and snowflake, rather than in your code.
You can try to execute your query and look at the query history in snowflake to check what is actually being executed on snowflake.
https://docs.snowflake.com/en/sql-reference/functions/query_history.html
SnowFlake & PowerBI "native queries aren't support by this value"
Maybe it is lowercase / uppercase issue as explained here:
https://community.powerbi.com/t5/Issues/Unable-to-query-case-sensitive-Snowflake-tables/idi-p/2028900
In debugging process I would advise you to pinpoint which part of query causes the error. It could be quotes you are using in first CTE, non uppercase table names, * character.

Redshift: How to list all users in a group

Getting the list of users belonging to a group in Redshift seems to be a fairly common task but I don't know how to interpret BLOB in grolist field.
I am literally getting "BLOB" in grolist field from TeamSQL. Not so sure this is specific to TeamSQL but I kind of remember thatI got a list of IDs there instead previously in other tool
This worked for me:
select usename
from pg_user , pg_group
where pg_user.usesysid = ANY(pg_group.grolist) and
pg_group.groname='<YOUR_GROUP_NAME>';
SELECT usename, groname
FROM pg_user, pg_group
WHERE pg_user.usesysid = ANY(pg_group.grolist)
AND pg_group.groname in (SELECT DISTINCT pg_group.groname from pg_group);
This will provide the usernames along with the respective groups.
this worked better for me:
SELECT
pu.usename,
pg.groname
FROM
pg_user pu
left join pg_group pg
on pu.usesysid = ANY(pg.grolist)
order by pu.usename

Getting table information for Redshift `stl_load_errors` errors

I am using Redshift COPY command to load data into Redshift table from S3. When something goes wrong, I typically get an error ERROR: Load into table 'example' failed. Check 'stl_load_errors' system table for details. I can always lookup stl_load_errors manually to get details. Now, I am trying to figure out how I can do that automatically.
From documentation it looks like the following query should give me all the details I need:
SELECT *
FROM stl_load_errors errors
INNER JOIN svv_table_info info
ON errors.tbl = info.table_id
AND info.schema = '<schema-name>'
AND info.table = '<table-name>'
However it always returns nothing. I also tried using stv_tbl_perm instead of svv_table_info, and still nothing.
After some troubleshooting, I see two things I don't understand:
I see multiple different IDs in stv_tbl_perm and svv_table_info for the same exact table. Why is that?
I see tbl filed on stl_load_errors referencing ids that do not exist in stv_tbl_perm or svv_table_info. Again why?
Feels like I don't understanding something in structure of these tables, but it completely escapes me what.
This is because tbl and table_id are with different types. First one is integer, second one is iod.
When you cast iod to integer the columns have the same values. You could check this query:
SELECT table_id::integer, table_id
FROM SVV_TABLE_INFO
I have result when I execute
SELECT errors.tbl, info.table_id::integer, info.table_id, *
FROM stl_load_errors errors
INNER JOIN svv_table_info info
ON errors.tbl = info.table_id
Please note that inner join is ON errors.tbl = info.table_id
I finally got to the bottom of it, and it is surprisingly boring and probably not useful to many ...
I had an existing table. My code that was creating the table was wrapped in transaction, and it was dropping the table inside the transaction. The code that was querying the stl_load_errors was outside the transaction. So the table_id outside and inside the transaction where different, as it was a different table.
You could try looking by filename. Doesn't really answer the question about joining the various tables, but I use a query like so to group up files that are part of the same manifest file and let me compare it to the maxerror setting:
select min(starttime) over (partition by substring(filename, 1, 53)) as starttime,
substring(filename, 1, 53) as filename, btrim(err_reason) as err_reason, count(*)
from stl_load_errors where filename like '%/some_s3_path/%'
group by starttime, filename, err_reason order by starttime desc;
This worked for me without any casting:
schemaz=# select i.database, e.err_code from stl_load_errors e join svv_table_info i on e.tbl=i.table_id limit 5
schemaz-# ;
database | err_code
-----------+----------
schemaz | 1204
schemaz | 1204
schemaz | 1204
schemaz | 1204
schemaz | 1204

update sql query using joins informix

I have one table table1 (id, name, surname, ssn) and a view1 (id, ssn) and here is my update clause
update table1 set
ssn=v.ssn
from table1 t,view v
where t.id=v.id
However I get syntax error sql code -201, does anybody knows what is the problem?
Can you try:
UPDATE table1 SET ssn=(SELECT ssn FROM view WHERE table1.id=view.id)
PS You use strange names: table1, view. They say nothing about data in those tables/views. I hope this is only for this question.
You can use the MERGE statement.
But this depends the version of the Informix engine are you working (needs version 11.50 for this answer work).
Check this other similar question/answer answer for more information.
MERGE INTO table1 as t1
USING table2 as t2
ON t1.ID = t2.ID
WHEN MATCHED THEN UPDATE set (t1.col1, t1.col2) = (t2.col1, t2.col2);