QuickSight "could not generate any output column after applying transformation" error

I am running a query that works perfectly in AWS Athena. However, when I use Athena as a data source in QuickSight and try to run the same query, it keeps giving me the error "QuickSight could not generate any output column after applying transformation".
Here is my query:
WITH register as (
select created_at as register_time
, serial_number
, node_name
, node_visible_time_name
from table1
where type = 'register'),
bought as (
select created_at as bought_time
, node_name
, serial_number
from table1
where type = 'bought')
SELECT r.node_name
, r.serial_number
, r.register_time
, b.bought_time
, r.node_visible_time_name
FROM register r
LEFT JOIN bought b
ON r.serial_number = b.serial_number
AND r.node_name = b.node_name
AND b.bought_time between r.register_time and date(r.register_time + INTERVAL '1' DAY)
LIMIT 11;
I did some searching and found a similar question, Quicksight custom query postgresql functions, where adding INTERVAL '1' DAY caused the problem. I've tried other alternatives, but no luck. Furthermore, running the query without it still outputs the same error message.
No other line seems to be getting transformed in any other way.
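For reference, a sketch of one such alternative join condition (date_add is the Presto/Athena built-in, which avoids the INTERVAL literal entirely):
LEFT JOIN bought b
ON r.serial_number = b.serial_number
AND r.node_name = b.node_name
AND b.bought_time between r.register_time and date_add('day', 1, r.register_time)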

Re-creating the dataset and running the exact same query works.
I think queries that have been run on an existing dataset transform the data. Please let me know if anyone knows why this is so.

Related

Column does not exist AWS Timestream Query error

I am trying to apply a WHERE clause on a DIMENSION of my AWS Timestream records. However, I get the error: Column does not exist.
Here is my table schema:
(screenshots of the table schema and table measures omitted)
First, I will show all the sample data I put in the table:
SELECT username, time, manual_usage
FROM "meter-reading"."meter-metrics"
ORDER BY time DESC
LIMIT 4
The result: (screenshot omitted)
What I want to do is query and filter the records by the dimension ("username" specifically).
SELECT *
FROM "meter-reading"."meter-metrics"
WHERE username = "OnceADay"
ORDER BY time DESC LIMIT 10
Then I got the Error: Column 'OnceADay' does not exist
I tried searching for any naming restrictions on dimension names and checked for errors in my schema:
https://docs.aws.amazon.com/timestream/latest/developerguide/ts-limits.html#limits.naming
https://docs.aws.amazon.com/timestream/latest/developerguide/ts-limits.html#limits.system_identifier
But I didn't find that my dimension name "username" violates any of the above rules.
I also checked some other queries from an AWS blog post, where the author used a WHERE clause on a dimension normally:
https://aws.amazon.com/blogs/database/effective-queries-for-common-query-patterns-in-amazon-timestream/
I figured it out after trying the sample code. It turns out it was a silly mistake, I believe: using apostrophes (') instead of double quotation marks (") solved my problem.
SELECT *
FROM "meter-reading"."meter-metrics"
WHERE username = 'OnceADay'
ORDER BY time DESC LIMIT 10
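The rule of thumb, for anyone hitting the same thing: in Timestream's SQL, as in standard SQL, double quotes delimit identifiers (column names) and single quotes delimit string literals, so these two predicates behave very differently:
WHERE username = "OnceADay" -- "OnceADay" is parsed as a column reference, hence Column 'OnceADay' does not exist
WHERE username = 'OnceADay' -- 'OnceADay' is a string literal, so rows are filtered by value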

Unable to connect Snowflake query to Power BI - syntax

I have this query in Snowflake. The query works fine in Snowflake, but when I try to connect it to Power BI, I get a native query error. The error usually pops up when there's a syntax error, but I can't find any syntax error here.
Any help would be appreciated as why there's an error.
Error: Native Queries aren't supported by this value.
WITH POLICIES AS (
SELECT DISTINCT a.POLICY_NUMBER
,c.DST
,d.DOB
,b.ENROLLED_RPM
,b.RATED_STATE
,a.EVENT_TIMESTAMP
FROM PD_PRESENTATION.CUSTOMER.REQUEST_FLOW_EDGE_MOBILE_TIER as a
LEFT JOIN PD_ANALYTICS.SVOC.POLICY as b
ON a.POLICY_NUMBER = b.POLICY_NUMBER
LEFT JOIN PD_ANALYTICS.SVOC.POLICY_HAS_POLICYHOLDER_PERSON as c
ON b.ID = c.SRC
LEFT JOIN PD_ANALYTICS.SVOC.PERSON as d
ON d.ID = c.DST
WHERE a.USER_GROUP = 'Customer'
AND b.STATUS = 'InForce'
),
MaximumTime AS (
SELECT a.POLICY_NUMBER
,MAX(a.EVENT_TIMESTAMP) as MAXDATED
FROM POLICIES as a
GROUP BY a.POLICY_NUMBER
)
SELECT DISTINCT a.*
,b.DOB
,b.ENROLLED_RPM
,b.RATED_STATE
,c.PAPERLESSPOLICYSTATUS
,c.PARTIALPAPERLESSSTATUS
,c.PAYPLAN
,MAX(c.TENUREPOLICYYEARS) as TENURE
FROM MaximumTime as a
LEFT JOIN POLICIES as b
ON a.POLICY_NUMBER = b.POLICY_NUMBER
LEFT JOIN PD_POLICY_CONFORMED.PEAK.POLICY as c
ON a.POLICY_NUMBER = c.POLICY_NUMBER
GROUP BY a.POLICY_NUMBER
,a.MAXDATED
,b.DOB, b.ENROLLED_RPM
,b.RATED_STATE
,c.PAPERLESSPOLICYSTATUS
,c.PARTIALPAPERLESSSTATUS
,c.PAYPLAN
Based on googling, I suspect that this is caused by the driver you are using (ODBC).
If the SQL runs fine in Snowflake, its syntax is correct, and the error must be somewhere between Power BI and Snowflake rather than in your code.
You can execute your query and look at the query history in Snowflake to check what is actually being executed:
https://docs.snowflake.com/en/sql-reference/functions/query_history.html
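For example, something like this (a sketch using the QUERY_HISTORY table function from that page) shows the statements Power BI actually sent, along with any error message Snowflake attached to them:
SELECT query_text, error_message, start_time
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY())
ORDER BY start_time DESC
LIMIT 20;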
SnowFlake & PowerBI "native queries aren't support by this value"
Maybe it is a lowercase/uppercase issue, as explained here:
https://community.powerbi.com/t5/Issues/Unable-to-query-case-sensitive-Snowflake-tables/idi-p/2028900
In the debugging process, I would advise you to pinpoint which part of the query causes the error. It could be the quotes you are using in the first CTE, non-uppercase table names, or the * character.
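If pinpointing doesn't help, another common workaround is to sidestep the native-query path entirely: materialize the statement as a view in Snowflake and let Power BI import the view through the navigator instead. A minimal sketch (the view name is hypothetical and the body is abbreviated; in practice it would be the full query from the question):
CREATE OR REPLACE VIEW PD_PRESENTATION.CUSTOMER.POLICY_TENURE_V AS
SELECT DISTINCT a.POLICY_NUMBER
,a.EVENT_TIMESTAMP
FROM PD_PRESENTATION.CUSTOMER.REQUEST_FLOW_EDGE_MOBILE_TIER as a
WHERE a.USER_GROUP = 'Customer';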

Show all rows by default in calendar in Oracle APEX

I'm creating a report-table-type calendar where users can create backups by date and select a filter that filters the table values depending on the user selected (i.e. if they choose user1, then only backups for user1 will show up).
I'm having issues with allowing the user to see all the backups in the table again (getting rid of the filtered value). My current query is this: (screenshot of the query omitted)
I would like it so that when P106_BACK_UP_BY_USER = 0, the table shows all the values (i.e. getting rid of the WHERE portion of the query).
Thank you for your help!
You can use a CASE expression in your query's WHERE condition, as follows:
select *
from my_table
where my_table.created_by =
(select user_name from my_table2 where app_users_id =
case :P106_BACK_UP_BY_USER when 0 then app_users_id
else :P106_BACK_UP_BY_USER
end)
And to get better help, please paste your code as text, not as an image, next time.
This should work too:
...
WHERE b.active_server = s.server_id
AND (:P106_BACK_UP_BY_USER = 0 OR
UPPER(b.created_by) =
(SELECT UPPER(user_name)
FROM eba_bt_app_users
WHERE app_users_id = :P106_BACK_UP_BY_USER
)
);

Spark SQL Union on Empty dataframe

We have a Glue job running a script in which we created two dataframes and registered them as temp views with the createOrReplaceTempView command. The dataframes are then combined via a union:
df1.createOrReplaceTempView("df1")
df2.createOrReplaceTempView("df2")
joined_df = spark.sql(
"""
Select
id
, name
, product
, year
, month
, day
from df1
Union ALL
Select
id
, name
, product
, year
, month
, day
from df2
"""
)
joined_df.createOrReplaceTempView("joined_df")
Everything appeared to be working fine until it failed. Upon research, I speculate it is because one of the dataframes is empty. This job runs daily, and occasionally one of the dataframes (df2) will not have data for the day.
The error message returned in the CloudWatch log is not entirely clear:
pyspark.sql.utils.AnalysisException: u"cannot resolve '`id`' given input columns:
[]; line 22 pos 7;\n'Union\n:- Project [contract_id#0, agr_type_cd#1, ind_product_cd#2,
report_dt#3, effective_dt#4, agr_trns_cd#5, agreement_cnt#6, dth_bnft#7, admn_system_cd#8,
ind_lob_cd#9, cca_agreement_cd#10, year#11, month#12, day#13]\n: +- SubqueryAlias `df1`\n
'Project ['id, name#20,
'product, 'year, 'month, 'day]\n +- SubqueryAlias `df2`\n +- LogicalRDD false\n"
I'm seeking advice on how to resolve such an issue. Thank you.
UPDATE:
I'm fairly certain I know the issue but unsure how to solve it. On days where the source file in S3 is "empty" (no records, only the header row), the read from the catalog returns an empty set:
df2 = glueContext.create_dynamic_frame.from_catalog(
database = glue_database,
table_name = df2_table,
push_down_predicate = pred
)
df2 = df2.toDF()
df2.show()
The output:
++
||
++
++
Essentially, the from_catalog method is not reading the schema from Glue. I would expect that even without data the header would be detected, and the union would just return everything from df1. But since I'm receiving an empty set without the header, the union cannot occur, because it acts as though the schema has changed or is non-existent, even though the underlying S3 files have not changed schema. This job triggers daily when source files have been loaded to S3. When there is data in the file, the union does not fail, as the schema can be detected from_catalog. Is there a way to read the header even when no data returns?
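One workaround sketch, assuming the schema is known ahead of time: have the driver detect the empty read (for example, when the frame has no columns) and register a typed empty view in its place, so the union still resolves. The column types below are assumptions and would need to match the Glue table definition:
CREATE OR REPLACE TEMPORARY VIEW df2 AS
SELECT CAST(NULL AS BIGINT) AS id
, CAST(NULL AS STRING) AS name
, CAST(NULL AS STRING) AS product
, CAST(NULL AS INT) AS year
, CAST(NULL AS INT) AS month
, CAST(NULL AS INT) AS day
LIMIT 0;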

Django: Distinct foreign keys

class Log(models.Model):
    project = models.ForeignKey(Project)
    msg = models.CharField(...)
    date = models.DateField(...)
I want to select the four most recent Log entries, where each Log entry must have a unique project foreign key. I've tried the solutions from a Google search, but none of them works, and the Django documentation isn't very good for lookups.
I tried stuff like:
Log.objects.all().distinct('project')[:4]
Log.objects.values('project').distinct()[:4]
Log.objects.values_list('project').distinct('project')[:4]
But these either return nothing or Log entries of the same project.
Any help would be appreciated!
Queries don't work like that - either in Django's ORM or in the underlying SQL. If you want to get unique IDs, you can only query for the ID. So you'll need to do two queries to get the actual Log entries. Something like:
id_list = Log.objects.order_by('-date').values_list('project_id', flat=True).distinct()[:4]
entries = Log.objects.filter(project_id__in=id_list)
Actually, you can get the project_ids in SQL. Assuming that you want the unique project ids for the four projects with the latest log entries, the SQL would look like this:
SELECT project_id, max(logs.date) as max_date
FROM logs
GROUP BY project_id
ORDER BY max_date DESC LIMIT 4;
Now, you actually want all of the log information. In PostgreSQL 8.4 and later you can use windowing functions, but that doesn't work on other versions/databases, so I'll do it the more complex way:
SELECT logs.*
FROM logs JOIN (
SELECT project_id, max(logs.date) as max_date
FROM logs
GROUP BY project_id
ORDER BY max_date DESC LIMIT 4 ) as latest
ON logs.project_id = latest.project_id
AND logs.date = latest.max_date;
Now, if you have access to windowing functions, it's a bit neater (I think anyway), and certainly faster to execute:
SELECT * FROM (
SELECT logs.field1, logs.field2, logs.field3, logs.date,
rank() over ( partition by project_id
order by "date" DESC ) as dateorder
FROM logs ) as logsort
WHERE dateorder = 1
ORDER BY logsort.date DESC LIMIT 4;
OK, maybe it's not easier to understand, but take my word for it, it runs worlds faster on a large database.
I'm not entirely sure how that translates to object syntax, though, or even if it does. Also, if you wanted to get other project data, you'd need to join against the projects table.
I know this is an old post, but in Django 2.0, I think you could just use:
Log.objects.values('project').distinct().order_by('project')[:4]
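Roughly the SQL that queryset produces (a sketch; the table and column names are assumed from the model above). Note that it returns distinct project ids rather than full Log rows:
SELECT DISTINCT "log"."project_id"
FROM "log"
ORDER BY "log"."project_id" ASC
LIMIT 4;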
You need two querysets. The good thing is it still results in a single trip to the database (though there is a subquery involved).
latest_ids_per_project = Log.objects.values_list(
'project').annotate(latest=Max('date')).order_by(
'-latest').values_list('project')
log_objects = Log.objects.filter(
id__in=latest_ids_per_project[:4]).order_by('-date')
This looks a bit convoluted, but it actually results in a surprisingly compact query:
SELECT "log"."id",
"log"."project_id",
"log"."msg"
"log"."date"
FROM "log"
WHERE "log"."id" IN
(SELECT U0."id"
FROM "log" U0
GROUP BY U0."project_id"
ORDER BY MAX(U0."date") DESC
LIMIT 4)
ORDER BY "log"."date" DESC