Why do I get the error "Table "table" must be qualified with a dataset (e.g. dataset.table)" when referencing a table in a CTE?
Did you highlight and run only part of the query? When you select just a portion of the statement, BigQuery executes that fragment on its own, without the WITH clause, so the CTE name is treated as an ordinary table that needs a dataset qualifier. Run the whole statement, or make sure the block you highlight doesn't depend on another part of the query.
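For illustration, here is a minimal sketch (the dataset, table and column names are made up) of the shape that triggers this: running the whole statement works, because recent_orders resolves to the CTE, but highlighting and running only the final SELECT makes BigQuery treat recent_orders as a base table, which then needs a dataset qualifier.
-- Running this whole statement works: recent_orders is resolved as the CTE.
-- Highlighting and running only the final SELECT raises the
-- "must be qualified with a dataset" error for recent_orders.
WITH recent_orders AS (
  SELECT order_id, amount
  FROM my_dataset.orders            -- base tables always need dataset.table
  WHERE order_date >= '2023-01-01'
)
SELECT COUNT(*) AS order_count
FROM recent_orders;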
I'm trying to create a very simple visualisation in QuickSight, and to do this I'm using a custom SQL query:
SELECT COUNT(distinct uuid), day
FROM analytics.myTable
GROUP BY day
Unfortunately, whenever I run this query in QuickSight it fails with the following error from the AWS Athena client:
SYNTAX_ERROR: line 2:8: Column '_col0' cannot be resolved
When I look in Athena, I can see that QuickSight is "nesting" the SQL query... this is what's causing the error in Athena:
/* QuickSight 4da449cf-ffc6-11e8-92ea-9ffafcc3adb3 */
SELECT "_col0"
FROM (SELECT COUNT(distinct uuid)
FROM pregnancy_analytics.final_test_parquet) AS "DAU"
What I don't understand is:
a) why is this flagging an error?
b) why is QuickSight nesting the SQL?
If I simply run the command directly in Athena,
SELECT COUNT(distinct uuid) FROM analytics.myTable
It does indeed show the column name "_col0":
_col0
1 1699174
so the fact that QuickSight's wrapper query selects "_col0" shouldn't actually be a problem.
Can someone offer some advice on how to resolve this issue?
Thanks
You can modify the query to explicitly name the aggregated column and then the query will work.
Example:
SELECT COUNT(distinct uuid) as "distinct_uuid", day
FROM analytics.myTable
GROUP BY day
Often in visualization software you will need to explicitly name your aggregate/function-wrapped columns, as they default to names like _col0, which the software doesn't parse well, so it throws that error.
Specifically, I see this all the time in Superset using Presto.
For your problem specifically, you should just do what Piotr recommended, which is to add a name after COUNT(distinct uuid). I'm partial to freq, but it looks like you'll want something like uuid or unique_uuid :)
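To see why the alias matters, here is an illustrative sketch (not the literal query QuickSight generates) of the nested shape from the question once the aggregate has a name the outer SELECT can reference:
-- the outer query can now reference a real column name instead of "_col0"
SELECT "distinct_uuid"
FROM (SELECT COUNT(distinct uuid) AS "distinct_uuid", day
      FROM analytics.myTable
      GROUP BY day) AS "DAU"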
I am trying to drop all the partitions on an external table in a Redshift cluster, and I am unable to find an easy way to do it. Currently I run a dynamic query that selects the partition dates from svv_external_partitions, concatenates them with the DROP logic, and then I take the result set and run the generated statements separately, like this:
select 'ALTER TABLE procore_iad_ext.active_histories DROP PARTITION (values='''||rtrim(ltrim(values, '["'),'"]') ||''');'
from svv_external_partitions
where tablename = 'xyz';
The values column looks like this: ["2009-03-10"]
Looking for a simpler direct solution. Thanks.
The easiest way to do this would be to drop the table itself. As long as you have the DDL to recreate the table and don't mind dropping all partitions, just DROP TABLE <schemaname>.<tablename>; then recreate the table. The new table will not have any partitions.
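A rough sketch of that drop-and-recreate approach, reusing the schema and table names from the question; the column list and partition column here are placeholders, and your saved DDL will differ:
-- drop the external table; this removes all of its partitions from the external catalog
DROP TABLE procore_iad_ext.active_histories;

-- recreate it from your saved DDL; the new table starts with no partitions
CREATE EXTERNAL TABLE procore_iad_ext.active_histories (
    id BIGINT,
    payload VARCHAR(65535)
)
PARTITIONED BY (partition_date DATE)   -- placeholder partition column
STORED AS PARQUET
LOCATION 's3://your-bucket/active_histories/';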
Please check out the Glue catalog. It provides a UI to easily delete the tables/partitions etc.
I'm trying to duplicate a Redshift table including modifiers.
I've tried using a CTAS statement, but for some reason that fails to copy modifiers like NOT NULL:
create table public.my_table as (select * from public.my_old_table limit 1);
There also doesn't seem to be a way to alter the table to add modifiers after creating it, which leads me to believe that there isn't a way to duplicate a Redshift table's schema except by running the original CREATE TABLE statement instead of the CTAS statement.
According to the docs you can do
CREATE TABLE my_table(LIKE my_old_table);
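A rough sketch of a full copy using that syntax, reusing the names from the question. Per the docs, LIKE copies column names, types and NOT NULL constraints; default expressions are copied only when INCLUDING DEFAULTS is specified, and the data has to be copied in a separate step:
-- copy the table definition, including NOT NULL constraints
-- (INCLUDING DEFAULTS also carries over column default expressions)
CREATE TABLE public.my_table (LIKE public.my_old_table INCLUDING DEFAULTS);

-- LIKE only copies the definition, so copy the rows separately
INSERT INTO public.my_table
SELECT * FROM public.my_old_table;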
Does the "Create table as" function in SQL Data Warehouse create statistics in the background, or do they have to manually be created (as I would when I do a normal "Create table" statement?)
As of the current version, you always have to create column-level statistics on tables, irrespective of whether it was created with a normal CREATE TABLE or the CTAS CREATE TABLE AS... command. It's also good practice to create stats for columns used in JOINs, WHERE clauses, GROUP BY, ORDER BY and DISTINCT clauses.
Regarding tables created with CTAS, the database engine has a correct idea of how many rows are in the table as listed in sys.partitions, but not at the column-level statistics level. For tables created by CREATE TABLE, the row count in sys.partitions defaults to 1,000. As an example, a table created with a CTAS and containing 208 rows shows the correct count in sys.partitions, while a table created with an ordinary CREATE TABLE and populated by an INSERT from the first table, also containing 208 rows, is reported by sys.partitions as having 1,000.
Creating any column-level statistics manually will correct this number.
In summary, always manually create statistics against important columns irrespective of how the table was created.
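For example, a minimal sketch of manually creating single-column statistics (the table and column names are placeholders):
-- create column-level statistics on the columns used in joins and filters
CREATE STATISTICS stat_my_table_customer_id ON dbo.my_table (customer_id);
CREATE STATISTICS stat_my_table_order_date ON dbo.my_table (order_date);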
I want to query a DynamoDB table based on an attribute UpdateTime, such that I get the records which were updated in the last 24 hours. But this attribute is not an index in the table. I understand that I need to make this attribute an index, but I do not know how to write a query expression for this.
I saw this question, but the problem is that I do not know, before runtime, the name of the table I want to query.
To find out the table names in your DynamoDB instance, you can use the "ListTables" API: http://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_ListTables.html.
Another way to view tables and their data is via the DynamoDB Console: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ConsoleDynamoDB.html.
Once you know the table name, you can either create an index with the UpdateTime attribute as a key or scan the whole table to get the results you want. Keep in mind that scanning a table is a costly operation.
Alternatively you can create a DynamoDB Stream that captures all of the changes to your tables: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.html.